Abstract: We study the performance of diffusion least-mean-square algorithms for distributed parameter estimation in multi-agent 
networks when nodes exchange information over wireless communication links. Wireless channel impairments, such as fading and 
path-loss, adversely affect the exchanged data and cause instability and performance degradation if left unattended. To mitigate these 
effects, we incorporate equalization coefficients into the diffusion combination step and update the combination weights dynamically 
in the face of randomly changing neighborhoods due to fading conditions. When channel state information (CSI) is unavailable, we 
determine the equalization factors from pilot-aided channel coefficient estimates. The analysis reveals that by properly monitoring the 
CSI over the network and choosing sufficiently small adaptation step-sizes, the diffusion strategies are able to deliver satisfactory 
performance in the presence of fading and path loss. 

1 Introduction 

Diffusion least-mean-squares (LMS) algorithms can serve as
efficient and powerful mechanisms for solving distributed
estimation and optimization problems over networks in real time,
in response to streaming data originating from different
locations [1]-[5]. Owing to their decentralized processing
structure, simplicity of implementation, and adaptive learning
capabilities, these algorithms are particularly well-suited for
applications involving multi-agent wireless networks, where
energy and radio resources are generally limited [?], [8], [9].
Consensus strategies can also be used for distributed estimation
purposes [10]-[15]. However, it was shown in [16] that for
constant step-size adaptation, network states can grow unbounded
due to an inherent asymmetry in the consensus dynamics. The same
problem does not occur for diffusion strategies, and for this
reason, we focus on these algorithms in this work.

Diffusion strategies have been widely investigated in networks
with static topologies in which the communication links between
agents remain invariant with respect to time [2], [7], [17]-[22].
Under such conditions, these strategies converge in the mean and
mean-square error sense in the slow adaptation regime [2], [3],
[5], [16], [23]. Previous studies have also examined the effect
of noisy communication links on the performance of these
algorithms over networks with static topologies [24]-[27]. The
main conclusion drawn from these works is that performance
degradation occurs unless the combination weights used at each
node are adjusted to counter the effect of noise.

The static link topology assumption, however, is restrictive in
applications in wireless communications and sensor network
systems. For example, in mobile networks where the agents are
allowed to change their position over time, the signal-to-noise
ratio (SNR) over the communication links between nodes will vary
due to the various channel impairments, including path loss,
multi-path fading and shadowing. Consequently, the set of nodes
with which each agent can communicate (called the neighborhood
set) will also change over time, as determined by the link SNR,
and the network topology is intrinsically dynamic. It is
therefore essential to study the performance of diffusion
strategies over networks with time-varying (dynamic) topology and
characterize the effects of link activity (especially link
failure) on their convergence and stability.

The problem of link imperfection was also investigated in other
classes of distributed algorithms, such as consensus [28]-[32]
and subgradient algorithms [9], [33]. In [28], [29] and [33], the
authors have examined the performance of consensus algorithms
over networks with link failures, where links are established
according to some predefined probabilities. They assumed that
once a link is activated at a given iteration the data received
through it will be undistorted. References [31], [32] have taken
into account the effects of link and quantization noise in
addition to link failure and investigated the network convergence
and stability. A more realistic network scenario was considered
in [30], [34] where the probabilities of link failure are
obtained using a fading channel model and the SNR of the received
signals. However, the data received from a neighboring node is
assumed to be error-free when the corresponding link is active.

In this paper, we study the performance of diffusion estimation
strategies over networks with time-varying topologies where the
information exchange between agents occurs over noisy wireless
links that are also subject to fading and path loss^1. Our
contributions are as follows. We extend the application of
diffusion LMS strategies from multi-agent networks with ideal
communication links to sensor networks with fading wireless
channels. Under fading and path loss conditions over wireless
links, the neighborhood sets become dynamic, with nodes leaving
or entering neighborhoods depending on the quality of the links
as defined by the instantaneous SNR conditions. Our analysis will
show that if each node knows the channel state information (CSI)
of its neighbors, the effects of fading and path loss can be
mitigated by incorporating local equalization coefficients into
the diffusion updates. When CSI is not available to the nodes, we
explain how the equalization coefficients can be evaluated from a
pilot-assisted estimation process along with the main parameter
estimation task of the network. We also examine the effect of
channel estimation errors on the performance and convergence of
the modified algorithms in terms of a mean-square-error metric.
We establish conditions under which the network is mean-square
stable for both known and unknown CSI cases. The analysis reveals
that when CSI is known, the modified diffusion algorithms are
asymptotically unbiased and converge in the slow adaptation
regime. In contrast, the parameter estimates will become biased
when the CSI is obtained through pilot-aided channel estimation.
Nevertheless, the size of the bias can be made small by
increasing the number of pilot symbols or increasing the link
SNR.

The paper is organized as follows. In Section 2, we 
explain the network signal model. In Section 3, we 
review the standard diffusion strategies and introduce 
a modification for distributed estimation over wireless 
networks. We analyze the convergence and stability of 
the proposed algorithms in Section 4. We present the 
simulation results in Section 5, and conclude the paper 
in Section 6. 

Notation: Matrices are represented by upper-case and vectors by
lower-case letters. Boldface fonts are reserved for random
variables and normal fonts are used for deterministic quantities.
Superscript $(\cdot)^T$ denotes transposition for real-valued
vectors and matrices while $(\cdot)^*$ denotes conjugate
transposition for complex-valued vectors and matrices. The symbol
$\mathbb{E}[\cdot]$ is the expectation operator,
$\mathrm{Tr}(\cdot)$ represents the trace of its matrix argument
and $\mathrm{diag}\{\cdot\}$ extracts the diagonal entries of a
matrix, or constructs a (block) diagonal matrix using its
argument. A set of vectors is stacked into a column vector by
$\mathrm{col}\{\cdot\}$. The $\mathrm{vec}(\cdot)$ operator
vectorizes a matrix by stacking its columns on top of each other
and $\mathrm{bvec}(\cdot)$ is the block-vectorization operator
[1]. The symbol $\otimes$ denotes the standard Kronecker product,
and the symbol $\otimes_b$ represents the block Kronecker product
[1].

1. A short preliminary version of this work was presented in the
IEEE International Conference on Communication (ICC), June 2013

2 Network Signal Model 

Consider a set of $N$ sensor nodes that are distributed over a
geographical area. At time instant $i \in \{0, 1, \ldots\}$, each
node $k \in \{1, 2, \ldots, N\}$ collects data $d_k(i)$ and
$u_{k,i}$ that are related to an unknown parameter vector
$w^o \in \mathbb{C}^{M \times 1}$ via the following relation:

$$d_k(i) = u_{k,i} w^o + v_k(i) \tag{1}$$

where $d_k(i) \in \mathbb{C}$, $u_{k,i} \in \mathbb{C}^{1 \times M}$
and $v_k(i) \in \mathbb{C}$ are, respectively, the scalar
measurement, the node's regression vector and the measurement
noise.

Assumption 1. The variables in the linear regression model
(1) satisfy the following conditions:

a) The regression vectors $\{u_{k,i}\}$ are zero-mean, i.i.d. in
time, and independent over space, with covariance matrices
$R_{u,k} = \mathbb{E}[u_{k,i}^* u_{k,i}] > 0$.

b) The measurement noise $\{v_k(i)\}$ is zero-mean, i.i.d. in
time, and independent over space, with variances $\sigma_{v,k}^2$.

c) The regression vectors $u_{k_1,i_1}$ and the noise
$v_{k_2}(i_2)$ are mutually independent for all $k_1$, $k_2$,
$i_1$ and $i_2$.

Node $\ell$ is said to be a neighbor of node $k$ if its distance
from node $k$ is less than a preset transmission range $r_o$
[36], which for simplicity is assumed to remain constant over the
given geographical area. The set of all neighbors of node $k$,
including node $k$ itself, is denoted by $\mathcal{N}_k$. Nodes
are allowed to communicate with their neighbors only, but due to
channel impairments, certain links may fail. Hence, at any given
time $i$, only a subset of the nodes in $\mathcal{N}_k$ can
communicate with node $k$.

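The objective of the network is to estimate the unknown parameter
vector $w^o$ in a distributed manner when the data exchange
between the agents occurs over noisy wireless links that are also
subject to fading and path loss. In particular, we assume that
the transmit signal $\psi_{\ell,i} \in \mathbb{C}^{M \times 1}$
from node $\ell \in \mathcal{N}_k \setminus \{k\}$ to node $k$ at
time $i$ experiences channel distortion of the following form
(see Fig. 1):

$$\psi_{\ell k,i} = h_{\ell,k}(i) \sqrt{\frac{P_t}{r_{\ell,k}^{\alpha}}}\, \psi_{\ell,i} + v_{\ell k,i}^{(\psi)} \tag{2}$$

where $\psi_{\ell k,i} \in \mathbb{C}^{M \times 1}$ is the
distorted estimate received by node $k$,
$h_{\ell,k}(i) \in \mathbb{C}$ denotes the fading channel
coefficient over the wireless link between nodes $k$ and $\ell$,
$P_t \in \mathbb{R}^+$ is the transmit signal power,
$r_{\ell,k} = r_{k,\ell} \in \mathbb{R}^+$ is the distance
between nodes $\ell$ and $k$, $\alpha \in \mathbb{R}^+$ is the
path loss exponent and
$v_{\ell k,i}^{(\psi)} \in \mathbb{C}^{M \times 1}$ is the
additive noise vector with covariance matrix
$\sigma_{v,\ell k}^{2(\psi)} I_M$. We define
$\psi_{kk,i} \triangleq \psi_{k,i}$ to maintain consistency in
the notation.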

Assumption 2. The fading channel coefficients and the link
noise in (2) satisfy the following conditions:

a) The time-varying channel coefficients $h_{\ell,k}(i)$ follow
Clarke's model [37], i.e., they are independent circular Gaussian
random variables with zero mean and variance
$\sigma_{h,\ell k}^2$.

b) The channel coefficients $h_{\ell,k}(i)$ are independent over
space and i.i.d. over time.

c) The noise vectors $\{v_{\ell k,i}^{(\psi)}\}$ are zero-mean,
i.i.d. in time and independent over space.

d) The channel coefficients $h_{\ell_1,k_1}(i_1)$, the noise
vectors $v_{\ell_2 k_2,i_2}^{(\psi)}$, the regression vectors
$u_{k_3,i_3}$ and the measurement noise $v_{k_4}(i_4)$ are
mutually independent for all $\ell_j$, $k_j$ and $i_j$ with
$j \in \{1,2,3,4\}$.

It is also assumed that nodes are aware of the positions of their
neighbors through some positioning techniques and, therefore,
$r_{\ell,k}$, $\ell \in \mathcal{N}_k$, is known to node $k$. A
transmission from node $\ell$ to node $k$ at time $i$ is said to
be successful if the SNR between nodes $\ell$ and $k$, denoted by
$\varsigma_{\ell k}(i)$, exceeds some threshold level
$\varsigma_{\ell k}^o$. The threshold level is defined as the SNR
in the non-fading link scenario and is computed as:

$$\varsigma_{\ell k}^o = \frac{P_t}{\sigma_{v,\ell k}^{2(\psi)}\, r_o^{\alpha}} \tag{3}$$

In fading conditions, the instantaneous SNR is:

$$\varsigma_{\ell k}(i) = \frac{|h_{\ell,k}(i)|^2 P_t}{\sigma_{v,\ell k}^{2(\psi)}\, r_{\ell,k}^{\alpha}} \tag{4}$$

When transmission is successful, we have
$\varsigma_{\ell k}(i) \ge \varsigma_{\ell k}^o$, which amounts
to the condition:

$$|h_{\ell,k}(i)|^2 \ge \nu_{\ell,k} \tag{5}$$

where $\nu_{\ell,k} = (r_{\ell,k}/r_o)^{\alpha}$. Since
$h_{\ell,k}(i)$ has a circular complex Gaussian distribution, the
squared magnitude $|h_{\ell,k}(i)|^2$ is exponentially
distributed with parameter
$\lambda_{\ell,k} = 1/\sigma_{h,\ell k}^2$ [38]. Considering this
fact, the probability of successful transmission is then given
by:

$$p_{\ell,k} = \Pr\big(|h_{\ell,k}(i)|^2 \ge \nu_{\ell,k}\big) = e^{-\lambda_{\ell,k}\nu_{\ell,k}} \tag{6}$$

This expression shows that the probability of successful
transmission decreases as the distance between two nodes
increases. As such, the link between neighboring nodes is not
guaranteed to be connected all the time, implying that the
network topology is time-varying. Under this condition, we
redefine the neighborhood set of node $k$ as a time-varying set
consisting of all nodes $\ell \in \mathcal{N}_k$ for which
$\varsigma_{\ell k}(i)$ exceeds $\varsigma_{\ell k}^o$, provided
that node $k$ knows the CSI of nodes $\ell \in \mathcal{N}_k$. In
this way, the effective neighborhood set of each node $k$ becomes
random and we, therefore, denote it by $\mathcal{N}_{k,i}$. This
implies that $\mathcal{N}_{k,i} \subseteq \mathcal{N}_k$ for all
$i$.

Fig. 1: Node $k$ receives distorted data from its
$n_k = |\mathcal{N}_k|$ neighbors at time $i$. The data are
affected by channel fading coefficients, $h_{\ell,k}(i)$, and
communication noise $v_{\ell k,i}^{(\psi)}$.
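As a quick numerical check of (6), the following Python sketch compares the closed-form success probability with a Monte Carlo estimate under the fading model of Assumption 2. It is only an illustration: the function names and all parameter values are hypothetical and are not taken from the paper.

```python
import math
import random


def success_probability(r_lk, r_o, alpha, sigma_h2):
    """Closed form of (6): Pr(|h|^2 >= nu) when |h|^2 ~ Exp(rate 1/sigma_h2)."""
    nu = (r_lk / r_o) ** alpha  # threshold nu_{l,k} defined after (5)
    return math.exp(-nu / sigma_h2)


def empirical_success_rate(r_lk, r_o, alpha, sigma_h2, trials=200000, seed=0):
    """Monte Carlo estimate of the same probability from circular Gaussian draws."""
    rng = random.Random(seed)
    nu = (r_lk / r_o) ** alpha
    s = math.sqrt(sigma_h2 / 2.0)  # std of the real and imaginary parts
    hits = 0
    for _ in range(trials):
        h = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        if abs(h) ** 2 >= nu:  # success condition (5)
            hits += 1
    return hits / trials
```

Calling `success_probability` with a larger `r_lk` (all else fixed) returns a smaller value, which mirrors the statement that links to more distant neighbors fail more often.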


3 Distributed Estimation over Wireless 
Channels 

We first briefly review the standard diffusion LMS strategies for
estimation of $w^o$ over multi-agent networks with ideal links.
We then elaborate on how to modify these strategies to enable the
estimation of $w^o$ in the presence of fading and wireless
channel impairments.

3.1 Diffusion Strategies over Ideal Communication Channels

In the context of mean-square-error estimation, diffusion
strategies are stochastic gradient algorithms that can be used
for the distributed minimization of the following global
objective function [2], [3]:

$$J^{\mathrm{glob}}(w) = \sum_{k=1}^{N} \mathbb{E}\,|d_k(i) - u_{k,i} w|^2 \tag{7}$$

There are various forms of diffusion depending on the order in
which the relevant adaptation and combination steps are
performed. The so-called Adapt-then-Combine (ATC) strategy takes
the following form:

$$\psi_{k,i} = w_{k,i-1} + \mu_k u_{k,i}^* \left[d_k(i) - u_{k,i} w_{k,i-1}\right] \tag{8}$$

$$w_{k,i} = \sum_{\ell \in \mathcal{N}_k} a_{\ell,k}\, \psi_{\ell,i} \tag{9}$$

where $\mu_k > 0$ is the step-size used by node $k$, and the
$\{a_{\ell,k}\}$ denote nonnegative entries of a left-stochastic
matrix $A$ that satisfy:

$$a_{\ell,k} = 0 \ \text{if } \ell \notin \mathcal{N}_k \quad \text{and} \quad \sum_{\ell \in \mathcal{N}_k} a_{\ell,k} = 1 \tag{10}$$

In this implementation, (8) is an adaptation step where node $k$
updates its intermediate estimate $w_{k,i-1}$ to $\psi_{k,i}$
using its measured data $\{u_{k,i}, d_k(i)\}$. Then (9) is a
combination step in which each node $k$ combines its intermediate
estimate $\psi_{k,i}$ with that of its neighbors to obtain
$w_{k,i}$.

While the above algorithm works well over ideal communication
channels, some degradation occurs when the exchange of
information between neighboring nodes is subject to noise, as
explained in [?], [24]-[26], [39], [40]. In this work, we move
beyond these earlier studies and examine the performance of
diffusion strategies over fading wireless channels. We also
suggest modifications to the update equations to counter the
effect of fading.
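The ATC recursion (8)-(9) over ideal links can be sketched in a few lines of Python. This is a minimal illustration, assuming real-valued data (so that conjugate transposition reduces to transposition), unit-variance white regressors, and a common step-size for all nodes; the function name and its arguments are hypothetical.

```python
import random


def atc_diffusion_lms(w_o, A, mu, noise_std, iters, seed=0):
    """Run ATC diffusion LMS over ideal links.

    A[l][k] holds a_{l,k}; every column of A must sum to one, as in (10).
    Returns the list of final estimates w_{k,i}, one per node.
    """
    rng = random.Random(seed)
    N, M = len(A), len(w_o)
    w = [[0.0] * M for _ in range(N)]
    for _ in range(iters):
        psi = []
        for k in range(N):
            # Data model (1) with real-valued white regressors.
            u = [rng.gauss(0.0, 1.0) for _ in range(M)]
            d = sum(ui * wi for ui, wi in zip(u, w_o)) + rng.gauss(0.0, noise_std)
            err = d - sum(ui * wi for ui, wi in zip(u, w[k]))
            # Adaptation step (8).
            psi.append([w[k][m] + mu * u[m] * err for m in range(M)])
        # Combination step (9): node k mixes the intermediate estimates.
        w = [[sum(A[l][k] * psi[l][m] for l in range(N)) for m in range(M)]
             for k in range(N)]
    return w
```

With a fully connected three-node network and uniform weights `A = [[1/3] * 3 for _ in range(3)]`, a small step-size such as `mu=0.05` drives every node's estimate close to `w_o`.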

3.2 Diffusion Strategies over Wireless Channels

We are initially motivated to replace the combination step in (9)
by

$$w_{k,i} = \sum_{\ell \in \mathcal{N}_k} a_{\ell,k}\, \hat{\psi}_{\ell k,i} \tag{11}$$

where $\hat{\psi}_{\ell k,i}$ is a refined version of the
distorted estimate $\psi_{\ell k,i}$ that node $k$ receives. The
refinement is computed through a scaling equalization step of the
form:

$$\hat{\psi}_{\ell k,i} = g_{\ell,k}(i)\, \psi_{\ell k,i} \tag{12}$$

where the scalar gain $g_{\ell,k}(i)$ is an equalization
coefficient to be chosen to counter the effect of fading. Recall
that $\psi_{\ell k,i}$ is related to $\psi_{\ell,i}$ via (2).
Moreover, since each node $k$ uses data from nodes
$\ell \in \mathcal{N}_k$ whose instantaneous SNR,
$\varsigma_{\ell k}(i)$, exceeds the threshold
$\varsigma_{\ell k}^o$, we need to further adjust (9) and replace
$\mathcal{N}_k$ and $a_{\ell,k}$, respectively, with
$\mathcal{N}_{k,i}$ and $a_{\ell,k}(i)$. This leads to:

$$w_{k,i} = \sum_{\ell \in \mathcal{N}_{k,i}} a_{\ell,k}(i)\, g_{\ell,k}(i)\, \psi_{\ell k,i} \tag{13}$$

Therefore, in wireless sensor networks, the ATC diffusion
strategy takes the form presented in Algorithm 1.

Algorithm 1: Diffusion ATC over Wireless Channels

$$\psi_{k,i} = w_{k,i-1} + \mu_k u_{k,i}^* \left[d_k(i) - u_{k,i} w_{k,i-1}\right] \tag{14}$$

$$w_{k,i} = \sum_{\ell \in \mathcal{N}_{k,i}} a_{\ell,k}(i)\, g_{\ell,k}(i)\, \psi_{\ell k,i} \tag{15}$$

One way to compute the equalization coefficients in (42) is to
employ the following zero-forcing type construction:

$$g_{\ell,k}(i) = \begin{cases} \dfrac{h_{\ell,k}^*(i)}{|h_{\ell,k}(i)|^2} \sqrt{\dfrac{r_{\ell,k}^{\alpha}}{P_t}}, & \text{if } \ell \in \mathcal{N}_{k,i} \setminus \{k\} \\ 1, & \text{if } \ell = k \end{cases} \tag{16}$$

Alternatively, if the noise variances
$\sigma_{v,\ell k}^{2(\psi)}$ are known, then one could also use
minimum mean-square-error (MMSE) estimation to obtain the
equalization coefficients. For simplicity, we continue with (16).
By switching the order of the adaptation and combination steps in
Algorithm 1, we will obtain the Combine-then-Adapt (CTA)
diffusion strategy, which is presented below as Algorithm 2. In
(17), $w_{\ell k,i-1}$ is the estimate of the global parameter at
node $\ell$ that undergoes similar path loss, fading and noise as
$\psi_{\ell k,i}$ described by (2).

Algorithm 2: Diffusion CTA over Wireless Channels

$$\psi_{k,i-1} = \sum_{\ell \in \mathcal{N}_{k,i}} a_{\ell,k}(i)\, g_{\ell,k}(i)\, w_{\ell k,i-1} \tag{17}$$

$$w_{k,i} = \psi_{k,i-1} + \mu_k u_{k,i}^* \left[d_k(i) - u_{k,i} \psi_{k,i-1}\right] \tag{18}$$

The combination coefficients $a_{\ell,k}(i)$ in (13) now become
random and time-dependent because the neighborhood sets,
$\mathcal{N}_{k,i}$, are also evolving with time. Moreover, they
need to satisfy

$$a_{\ell,k}(i) = 0 \ \text{if } \ell \notin \mathcal{N}_{k,i} \quad \text{and} \quad \sum_{\ell \in \mathcal{N}_{k,i}} a_{\ell,k}(i) = 1 \tag{19}$$
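The zero-forcing construction (16) can be checked numerically: multiplying the received vector of (2) by $g_{\ell,k}(i)$ should undo the fading and path-loss scaling exactly when the link is noise-free. The Python sketch below illustrates this under the assumption that node $k$ knows $h_{\ell,k}(i)$ perfectly; the helper names are hypothetical.

```python
import math
import random


def zf_gain(h, r, alpha, p_t):
    """Zero-forcing gain of (16) for a neighbor l != k."""
    return (h.conjugate() / abs(h) ** 2) * math.sqrt(r ** alpha / p_t)


def faded_link(psi, h, r, alpha, p_t, noise_var, rng):
    """Channel model (2): scale by h*sqrt(P_t / r^alpha), add circular noise."""
    scale = h * math.sqrt(p_t / r ** alpha)
    s = math.sqrt(noise_var / 2.0)
    return [scale * x + complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
            for x in psi]


def equalize(received, g):
    """Scaling equalization step (12)."""
    return [g * x for x in received]
```

Because the same gain also multiplies the link noise, the noise power after equalization grows as $|g_{\ell,k}(i)|^2$, which is large for deeply faded links; this is one way to read the role of the SNR threshold in selecting $\mathcal{N}_{k,i}$.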

The randomness of $a_{\ell,k}(i)$ can be further clarified by
resorting to (5). The communication between nodes $\ell$ and $k$
is successful if (5) is satisfied; otherwise, the link between
them fails. When the link fails, the associated combination
weight $a_{\ell,k}(i)$ must be set to zero, which in turn implies
that other combination coefficients of node $k$ need to be
adjusted to satisfy (19). This suggests that the neighborhood set
$\mathcal{N}_{k,i}$ has to be updated whenever one of the
neighborhood link SNRs crosses the threshold in either direction:

$$\mathcal{N}_{k,i} = \left\{ \ell \in \mathcal{N}_k \,\middle|\, \varsigma_{\ell k}(i) \ge \varsigma_{\ell k}^o \right\} \tag{20}$$

In practice, since $\varsigma_{\ell k}(i)$ may not be measurable,
we use (3)-(4) and (5) to update the neighborhood set as:

$$\mathcal{N}_{k,i} = \left\{ \ell \in \mathcal{N}_k \,\middle|\, |h_{\ell,k}(i)|^2 \ge \nu_{\ell,k} \right\} \tag{21}$$

Motivated by these considerations, we propose the following
dynamic structure to adjust the combination weights over time:

$$a_{\ell,k}(i) = \begin{cases} \gamma_{\ell,k}\, \mathbb{I}_{\ell,k}(i), & \text{if } \ell \in \mathcal{N}_{k,i} \setminus \{k\} \\ 1 - \sum_{j \in \mathcal{N}_{k,i} \setminus \{k\}} a_{j,k}(i), & \text{if } \ell = k \end{cases} \tag{22}$$

where the $\gamma_{\ell,k}$ are fixed, positive combination
weights that node $k$ assigns to its neighbors
$\ell \in \mathcal{N}_{k,i}$. To ensure $a_{k,k}(i) > 0$, these
weights need to satisfy:

$$\sum_{\ell \in \mathcal{N}_k \setminus \{k\}} \gamma_{\ell,k} < 1 \tag{23}$$

It can be verified that if each node $k$ obtains the coefficients
$\gamma_{\ell,k}$ for the time-invariant neighborhood set
$\mathcal{N}_k$ according to well-known left or doubly-stochastic
matrix combination rules (e.g., uniform averaging rule or
Metropolis rule) then the condition (23) will be satisfied. In
(22), the quantity $\mathbb{I}_{\ell,k}(i)$ is defined as:

$$\mathbb{I}_{\ell,k}(i) = \begin{cases} 1, & \text{if } \ell \in \mathcal{N}_{k,i} \\ 0, & \text{otherwise} \end{cases} \tag{24}$$

When transmission from node $\ell$ to node $k$ is successful,
$\mathbb{I}_{\ell,k}(i) = 1$; otherwise,
$\mathbb{I}_{\ell,k}(i) = 0$. In this way, the entries
$a_{\ell,k}(i)$ satisfy condition (19). From (20) and (24), we
see that the indicator operator, $\mathbb{I}_{\ell,k}(i)$, is a
random variable with Bernoulli distribution for which the
probability of success, $p_{\ell,k}$, is given by the exponential
function (6).
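The weight rule (21)-(24) amounts to zeroing the weights of failed links and returning the freed mass to the self-weight. A minimal Python sketch for a single node $k$ follows; the dictionary-based interface and the function name are assumptions made for illustration only.

```python
def combination_weights(k, neighbors, gamma, h_sq, nu):
    """Combination weights a_{l,k}(i) of node k at one time instant.

    neighbors: labels in the static set N_k (including k itself)
    gamma[l]:  fixed weight gamma_{l,k} for l != k, with sum < 1 as in (23)
    h_sq[l]:   squared channel magnitude |h_{l,k}(i)|^2
    nu[l]:     threshold nu_{l,k} from (5)
    """
    a = {}
    for l in neighbors:
        if l == k:
            continue
        indicator = 1 if h_sq[l] >= nu[l] else 0  # (24) via the test in (21)
        a[l] = gamma[l] * indicator               # first case of (22)
    a[k] = 1.0 - sum(a.values())                  # self-weight, so (19) holds
    return a
```

For example, with `gamma = {1: 0.3, 2: 0.3}` and a failed link from node 2, the returned weights are 0.3 for node 1, 0 for node 2, and 0.7 for node $k$ itself.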
3.3 Modeling the Impact of Channel Estimation Errors

In Algorithms 1 and 2, it is assumed that each node $k$ knows the
channel fading coefficients $h_{\ell,k}(i)$, which are needed in
(16). In practice, this information is usually recovered by means
of an estimation step. Consequently, some additional estimation
errors will be introduced into the network.

There are many ways by which the fading coefficients can be
estimated. For example, we may assume that the transmitted data
from node $\ell$ to node $k$ carries two data types, namely,
pilot symbols (training data) denoted by $s_{\ell}(i)$, and data
symbols $\psi_{\ell,i}$ or $w_{\ell,i}$. The training data are
used for channel estimation and the data symbols are the
intermediate estimates of the unknown parameter vector, $w^o$,
which are used to update the network estimate at node $k$.
According to (2), the received training data at node $k$ and time
$i$ is affected by fading and noise, i.e.,

$$y_{\ell,k}(i) = h_{\ell,k}(i) \sqrt{\frac{P_t}{r_{\ell,k}^{\alpha}}}\, s_{\ell}(i) + v_{\ell k}^{(y)}(i) \tag{25}$$

where $v_{\ell k}^{(y)}(i)$ is a zero-mean additive white
Gaussian noise with variance $\sigma_{v,\ell k}^{2(y)}$. It is
reasonable to assume that
$\sigma_{v,\ell k}^{2(y)} = \sigma_{v,\ell k}^{2(\psi)}$. The
number of training symbols used depends on the specific
application requirements and the time scale variations of the
channel. If we use a single training symbol to estimate each
coefficient and assume that nodes $k \in \{1, 2, \ldots, N\}$
send $s_k(i) = 1$ as training symbols, the least-squares
estimation method gives the following estimate:

$$\hat{h}_{\ell,k}(i) = \sqrt{\frac{r_{\ell,k}^{\alpha}}{P_t}}\, y_{\ell,k}(i) \tag{26}$$

Remark 1. If we use an alternative way to find the threshold
SNR, $\varsigma_{\ell k}^o$, without using distance information,
then (25) can be expressed as
$y_{\ell,k}(i) = \beta_{\ell,k}(i) s_{\ell}(i) + v_{\ell k}^{(y)}(i)$,
where $\beta_{\ell,k}(i) = h_{\ell,k}(i) (P_t / r_{\ell,k}^{\alpha})^{1/2}$.
In this form the fading coefficient and path loss are combined
into a new channel coefficient $\beta_{\ell,k}(i)$ that
implicitly includes the distance information. In this case, to
estimate the channel coefficients $\beta_{\ell,k}(i)$, unlike
(26), the distance information is not required.

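From (25), it can be seen that $y_{\ell,k}(i)$ is composed of the
sum of two independent circular Gaussian random variables. It
follows that $y_{\ell,k}(i)$ will have circular Gaussian
distribution with zero mean and variance
$\frac{P_t}{r_{\ell,k}^{\alpha}} \sigma_{h,\ell k}^2 + \sigma_{v,\ell k}^{2(y)}$.
From (26), we therefore conclude that $\hat{h}_{\ell,k}(i)$ has
circular Gaussian distribution with zero mean and variance
$\sigma_{h,\ell k}^2 + \frac{r_{\ell,k}^{\alpha}}{P_t} \sigma_{v,\ell k}^{2(y)}$,
and $|\hat{h}_{\ell,k}(i)|^2$ has exponential distribution with
parameter

$$\lambda_{\ell,k} = \frac{1}{\sigma_{h,\ell k}^2 + \frac{r_{\ell,k}^{\alpha}}{P_t} \sigma_{v,\ell k}^{2(y)}} \tag{27}$$

From here the probability of successful transmission from node
$\ell$ to node $k$ will be defined in terms of the estimated
channel coefficient as

$$p_{\ell,k} = \Pr\big(|\hat{h}_{\ell,k}(i)|^2 \ge \nu_{\ell,k}\big) = e^{-\lambda_{\ell,k}\nu_{\ell,k}} \tag{28}$$

Considering the assumed training data and from (25) and (26), the
instantaneous channel estimation error will be

$$\tilde{h}_{\ell,k}(i) \triangleq \hat{h}_{\ell,k}(i) - h_{\ell,k}(i) = \sqrt{\frac{r_{\ell,k}^{\alpha}}{P_t}}\, v_{\ell k}^{(y)}(i) \tag{29}$$

Therefore, the variance of the estimation error is:

$$\sigma_{\tilde{h},\ell k}^2 = \mathbb{E}\,|\tilde{h}_{\ell,k}(i)|^2 = \frac{r_{\ell,k}^{\alpha}}{P_t}\, \sigma_{v,\ell k}^{2(y)} \tag{30}$$

which shows that the power of the channel estimation error,
$\sigma_{\tilde{h},\ell k}^2$, decreases if the node transmit
power increases or if the distance between nodes $\ell$ and $k$
decreases. An alternative way to reduce the channel estimation
error is to use more pilot data. It can be shown that if the
wireless channel remains invariant over the transmission of $n$
pilot data, then the estimation error variance will be scaled by
a factor of $1/n$ [41].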
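The error variance (30) and its $1/n$ scaling can be illustrated with a short Monte Carlo experiment. The Python sketch below assumes unit pilots and a channel that stays fixed over the $n$ pilot transmissions, in which case the least-squares estimate is the pilot average rescaled as in (26); the function names and the fixed channel value are hypothetical.

```python
import math
import random


def estimate_channel(h, r, alpha, p_t, noise_var, n_pilots, rng):
    """Least-squares channel estimate from n unit pilots sent over a fixed channel."""
    amp = math.sqrt(p_t / r ** alpha)
    s = math.sqrt(noise_var / 2.0)
    y_mean = 0j
    for _ in range(n_pilots):
        # Received pilot (25) with s_l(i) = 1.
        y_mean += h * amp + complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    y_mean /= n_pilots
    return y_mean / amp  # rescaling of (26) applied to the pilot average


def error_variance(r, alpha, p_t, noise_var, n_pilots, trials=20000, seed=0):
    """Empirical E|h_hat - h|^2; the error does not depend on the value of h."""
    rng = random.Random(seed)
    h = 0.7 + 0.2j  # arbitrary fixed channel value, chosen for illustration
    total = 0.0
    for _ in range(trials):
        h_hat = estimate_channel(h, r, alpha, p_t, noise_var, n_pilots, rng)
        total += abs(h_hat - h) ** 2
    return total / trials
```

With `r=2`, `alpha=2`, `p_t=1` and `noise_var=0.1`, expression (30) predicts an error variance of 0.4 for a single pilot and 0.1 for four pilots, and the empirical averages should land close to these values.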


Remark 2. The time index $i$, in Algorithms 1 and 2, refers
to the iteration number of adaptation and combination steps
and not the time at which the communication between nodes
occurs. This implies that from time index $i - 1$ to $i$, a
node may transmit several training symbols to its neighbors
for the channel estimation process and, therefore, the estimated
channels used in iteration $i$ may be obtained using several
pilot data. However, to simplify the presentation, we also use
index $i$ to represent the communication time of pilots in (25)
since it is assumed that a single pilot datum is used for channel
estimation.


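We can now express (2) in terms of the estimated channels
$\hat{h}_{\ell,k}(i)$ and the channel estimation error as

$$\psi_{\ell k,i} = \hat{h}_{\ell,k}(i) \sqrt{\frac{P_t}{r_{\ell,k}^{\alpha}}}\, \psi_{\ell,i} - \tilde{h}_{\ell,k}(i) \sqrt{\frac{P_t}{r_{\ell,k}^{\alpha}}}\, \psi_{\ell,i} + v_{\ell k,i}^{(\psi)} \tag{31}$$

The equalization coefficients $g_{\ell,k}(i)$ are computed using
the estimated channels $\hat{h}_{\ell,k}(i)$, according to (16).
Using this construction, the equalized received data at node $k$
become:

$$g_{\ell,k}(i)\, \psi_{\ell k,i} = \left(1 - g_{\ell,k}(i)\, \tilde{h}_{\ell,k}(i) \sqrt{\frac{P_t}{r_{\ell,k}^{\alpha}}}\right) \psi_{\ell,i} + g_{\ell,k}(i)\, v_{\ell k,i}^{(\psi)} \tag{32}$$

Substituting the equalized data into (42), we obtain:

$$w_{k,i} = \sum_{\ell \in \mathcal{N}_{k,i}} a_{\ell,k}(i)\, \psi_{\ell,i} + \sum_{\ell \in \mathcal{N}_{k,i}} e_{\ell,k}(i)\, \psi_{\ell,i} + v_{k,i}^{(\psi)} \tag{33}$$

where

$$e_{\ell,k}(i) \triangleq -a_{\ell,k}(i)\, g_{\ell,k}(i)\, \tilde{h}_{\ell,k}(i) \sqrt{\frac{P_t}{r_{\ell,k}^{\alpha}}} \tag{34}$$

$$v_{k,i}^{(\psi)} \triangleq \sum_{\ell \in \mathcal{N}_{k,i} \setminus \{k\}} a_{\ell,k}(i)\, g_{\ell,k}(i)\, v_{\ell k,i}^{(\psi)} \tag{35}$$

There are several important features in the combination step (33)
that need to be highlighted. First, the combination coefficients,
$\{a_{\ell,k}(i)\}$, used in this step are time varying. These
coefficients, in addition to combining the exchanged information,
model the link failure phenomenon over the network. Second,
$\{g_{\ell,k}(i)\}$ account for the effects of fading channels.
Using these variables and the control SNR mechanism introduced
above, we can reduce the effect of link noise. Third, in (33),
$\{e_{\ell,k}(i)\}$ model the channel estimation errors, which
allows us to examine the impact of these errors on the diffusion
strategies.

In summary, in a multi-agent wireless network, each node $k$ will
perform the processing tasks listed in Table I in order of
precedence to complete cycle $i$ of the ATC diffusion LMS
algorithm.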


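TABLE I: ATC diffusion implementation

$$\hat{h}_{\ell,k}(i) = \sqrt{\frac{r_{\ell,k}^{\alpha}}{P_t}}\, y_{\ell,k}(i), \quad \text{if } \ell \in \mathcal{N}_k \setminus \{k\} \tag{36}$$

$$\mathcal{N}_{k,i} = \left\{ \ell \in \mathcal{N}_k \,\middle|\, |\hat{h}_{\ell,k}(i)|^2 \ge \nu_{\ell,k} \right\} \tag{37}$$

$$g_{\ell,k}(i) = \begin{cases} \dfrac{\hat{h}_{\ell,k}^*(i)}{|\hat{h}_{\ell,k}(i)|^2} \sqrt{\dfrac{r_{\ell,k}^{\alpha}}{P_t}}, & \text{if } \ell \in \mathcal{N}_{k,i} \setminus \{k\} \\ 1, & \text{if } \ell = k \end{cases} \tag{38}$$

$$\mathbb{I}_{\ell,k}(i) = \begin{cases} 1, & \text{if } \ell \in \mathcal{N}_{k,i} \\ 0, & \text{otherwise} \end{cases} \tag{39}$$

$$a_{\ell,k}(i) = \begin{cases} \gamma_{\ell,k}\, \mathbb{I}_{\ell,k}(i), & \text{if } \ell \in \mathcal{N}_{k,i} \setminus \{k\} \\ 1 - \sum_{j \in \mathcal{N}_{k,i} \setminus \{k\}} a_{j,k}(i), & \text{if } \ell = k \end{cases} \tag{40}$$

$$\psi_{k,i} = w_{k,i-1} + \mu_k u_{k,i}^* \left[d_k(i) - u_{k,i} w_{k,i-1}\right] \tag{41}$$

$$w_{k,i} = \sum_{\ell \in \mathcal{N}_{k,i}} a_{\ell,k}(i)\, g_{\ell,k}(i)\, \psi_{\ell k,i} \tag{42}$$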
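The processing chain of Table I can be strung together in a compact simulation. The Python sketch below is a rough illustration under simplifying assumptions that are not in the paper: scalar regressors ($M = 1$), one unit pilot per link and iteration, identical noise variance on the pilot and data transmissions, and a common step-size. All names and parameter choices are hypothetical.

```python
import math
import random


def cgauss(rng, var):
    """Zero-mean circular complex Gaussian sample with variance var."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def atc_wireless(w_o, dist, gamma, mu, p_t, alpha, r_o, sigma_h2,
                 sigma_link2, sigma_meas2, iters, seed=0):
    """ATC diffusion over fading links for a scalar parameter w_o.

    dist[l][k] is the distance r_{l,k}; gamma[l][k] is the fixed weight
    gamma_{l,k}, with each column summing to less than one as in (23).
    """
    rng = random.Random(seed)
    n = len(dist)
    w = [0j] * n
    for _ in range(iters):
        # Adaptation step (41) at every node.
        psi = []
        for k in range(n):
            u = cgauss(rng, 1.0)
            d = u * w_o + cgauss(rng, sigma_meas2)
            psi.append(w[k] + mu * u.conjugate() * (d - u * w[k]))
        new_w = []
        for k in range(n):
            acc, self_weight = 0j, 1.0
            for l in range(n):
                if l == k or dist[l][k] >= r_o:
                    continue  # l is not a neighbor in the static set N_k
                amp = math.sqrt(p_t / dist[l][k] ** alpha)
                h = cgauss(rng, sigma_h2)
                y = h * amp + cgauss(rng, sigma_link2)   # pilot (25), s = 1
                h_hat = y / amp                           # estimate (36)
                if abs(h_hat) ** 2 < (dist[l][k] / r_o) ** alpha:
                    continue                              # link dropped by (37)
                g = h_hat.conjugate() / abs(h_hat) ** 2 / amp  # gain (38)
                rx = h * amp * psi[l] + cgauss(rng, sigma_link2)  # model (2)
                acc += gamma[l][k] * g * rx               # weights (40)
                self_weight -= gamma[l][k]
            new_w.append(acc + self_weight * psi[k])      # combination (42)
        w = new_w
    return w
```

In this sketch a fresh channel realization is drawn for every link and iteration and is shared by the pilot and the data transmission, which is one possible reading of Remark 2; other timing assumptions would change only the two lines that generate `y` and `rx`.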


4 Performance Analysis 

In this section, we derive conditions under which the 
equalized diffusion strategies are stable in the mean 
and mean square sense. We also derive expressions 
to characterize the mean-square-deviation (MSD) and 
excess mean-square-error (EMSE) performance levels of 
the algorithms during the transient phase and in steady- 
state. We focus on the ATC variant (41)-(42). The same 
conclusions hold for (17)-(18) with minor adjustments. 

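To derive a recursion for the mean error-vector of the network,
we begin with defining the local error vectors:

$$\tilde{w}_{k,i} = w^o - w_{k,i} \tag{43}$$

$$\tilde{\psi}_{k,i} = w^o - \psi_{k,i} \tag{44}$$

We subtract both sides of (41) and (33) from $w^o$, and use
$\sum_{\ell \in \mathcal{N}_{k,i}} a_{\ell,k}(i) = 1$ together
with $\psi_{\ell,i} = w^o - \tilde{\psi}_{\ell,i}$, to obtain:

$$\tilde{\psi}_{k,i} = (I - \mu_k u_{k,i}^* u_{k,i})\, \tilde{w}_{k,i-1} - \mu_k u_{k,i}^* v_k(i) \tag{45}$$

$$\tilde{w}_{k,i} = \sum_{\ell \in \mathcal{N}_{k,i}} a_{\ell,k}(i)\, \tilde{\psi}_{\ell,i} + \sum_{\ell \in \mathcal{N}_{k,i}} e_{\ell,k}(i)\, \tilde{\psi}_{\ell,i} - \sum_{\ell \in \mathcal{N}_{k,i}} e_{\ell,k}(i)\, w^o - v_{k,i}^{(\psi)} \tag{46}$$

We collect the $\{a_{\ell,k}(i)\}$ into a left-stochastic matrix
$A_i$ and the $\{e_{\ell,k}(i)\}$ into an error matrix $E_i$. We
also define the extended versions of these matrices using
Kronecker products as $\mathcal{A}_i = A_i \otimes I_M$ and
$\mathcal{E}_i = E_i \otimes I_M$. We further introduce the
network error vectors:

$$\tilde{\psi}_i = \mathrm{col}\{\tilde{\psi}_{1,i}, \tilde{\psi}_{2,i}, \ldots, \tilde{\psi}_{N,i}\} \tag{47}$$

$$\tilde{w}_i = \mathrm{col}\{\tilde{w}_{1,i}, \tilde{w}_{2,i}, \ldots, \tilde{w}_{N,i}\} \tag{48}$$

and the variables:

$$\mathcal{R}_i = \mathrm{diag}\{u_{1,i}^* u_{1,i}, \ldots, u_{N,i}^* u_{N,i}\} \tag{49}$$

$$\mathcal{M} = \mathrm{diag}\{\mu_1 I_M, \ldots, \mu_N I_M\} \tag{50}$$

$$p_i = \mathrm{col}\{u_{1,i}^* v_1(i), \ldots, u_{N,i}^* v_N(i)\} \tag{51}$$

$$v_i^{(\psi)} = \mathrm{col}\{v_{1,i}^{(\psi)}, \ldots, v_{N,i}^{(\psi)}\} \tag{52}$$

$$\omega^o = \mathbb{1}_N \otimes w^o \tag{53}$$

where $\mathbb{1}_N$ is a column vector with length $N$ and unit
entries. We can now use (45) and (46) to verify that the
following recursion holds for the network error vector:

$$\tilde{w}_i = \mathcal{B}_i \tilde{w}_{i-1} - (\mathcal{A}_i + \mathcal{E}_i)^T \mathcal{M} p_i - \mathcal{E}_i^T \omega^o - v_i^{(\psi)} \tag{54}$$

where

$$\mathcal{B}_i = (\mathcal{A}_i + \mathcal{E}_i)^T (I - \mathcal{M} \mathcal{R}_i) \tag{55}$$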
4.1 Mean Convergence 

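Taking the expectation of (54) under Assumptions 1 and 2, we
arrive at

$$\mathbb{E}[\tilde{w}_i] = \mathcal{B}\, \mathbb{E}[\tilde{w}_{i-1}] - \mathcal{E}^T \omega^o \tag{56}$$

where

$$\mathcal{B} = \mathbb{E}[\mathcal{B}_i] = (\mathcal{A} + \mathcal{E})^T (I - \mathcal{M} \mathcal{R}) \tag{57}$$

$$\mathcal{A} \triangleq \mathbb{E}[\mathcal{A}_i] = A \otimes I_M \tag{58}$$

$$\mathcal{E} \triangleq \mathbb{E}[\mathcal{E}_i] = E \otimes I_M \tag{59}$$

$$\mathcal{R} \triangleq \mathbb{E}[\mathcal{R}_i] = \mathrm{diag}\{R_{u,1}, \ldots, R_{u,N}\} \tag{60}$$

To obtain (56), we used the fact that $v_k(i)$ is independent of
$u_{k,i}$ and $\mathbb{E}[v_k(i)] = 0$. Moreover, we have
$\mathbb{E}[v_i^{(\psi)}] = 0$ because $g_{\ell,k}(i)$ is
independent of $v_{\ell k,i}^{(\psi)}$ and
$\mathbb{E}[v_{\ell k,i}^{(\psi)}] = 0$.

Considering the time-varying left-stochastic matrix $A_i$, we can
use (22) to find the entries of $A = \mathbb{E}[A_i]$, i.e.,

$$a_{\ell,k} = \begin{cases} \gamma_{\ell,k}\, p_{\ell,k}, & \text{if } \ell \in \mathcal{N}_k \setminus \{k\} \\ 1 - \sum_{j \in \mathcal{N}_k \setminus \{k\}} a_{j,k}, & \text{if } \ell = k \end{cases} \tag{61}$$

Observe that $A^T \mathbb{1}_N = \mathbb{1}_N$. The
$(\ell,k)$-th entry of matrix $E$ is zero on the diagonal and,
for $\ell \ne k$, is given by:

$$\begin{aligned} e_{\ell,k} &\overset{(i)}{=} -\gamma_{\ell,k}\, \mathbb{E}\left[ \mathbb{I}_{\ell,k}(i)\, g_{\ell,k}(i)\, \tilde{h}_{\ell,k}(i) \sqrt{\frac{P_t}{r_{\ell,k}^{\alpha}}} \right] \\ &\overset{(ii)}{=} -\gamma_{\ell,k}\, p_{\ell,k}\, \mathbb{E}\left[ g_{\ell,k}(i)\, \tilde{h}_{\ell,k}(i) \sqrt{\frac{P_t}{r_{\ell,k}^{\alpha}}} \,\middle|\, |\hat{h}_{\ell,k}(i)|^2 \ge \nu_{\ell,k} \right] \\ &\overset{(iii)}{=} -\gamma_{\ell,k}\, p_{\ell,k}\, \mathbb{E}\left[ \frac{\tilde{h}_{\ell,k}(i)}{h_{\ell,k}(i) + \tilde{h}_{\ell,k}(i)} \,\middle|\, |h_{\ell,k}(i) + \tilde{h}_{\ell,k}(i)|^2 \ge \nu_{\ell,k} \right] \end{aligned} \tag{62}$$

The equality in step (ii) follows from the fact that
$g_{\ell,k}(i)$ is defined for
$\ell \in \mathcal{N}_k \setminus \{k\}$ when
$|\hat{h}_{\ell,k}(i)|^2 \ge \nu_{\ell,k}$, for which
$\mathbb{I}_{\ell,k}(i) = 1$. We obtain (iii) by expressing
$g_{\ell,k}(i)$ in terms of $h_{\ell,k}(i)$ and
$\tilde{h}_{\ell,k}(i)$ according to (25), (36) and (38).
Expression (62) indicates that $e_{\ell,k}$ is bounded.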


Remark 3. From the right hand side of (62), it can be
verified that the value of the expectation is independent of time
since the estimation error, $\tilde{h}_{\ell,k}(i)$, and the
channel coefficients, $h_{\ell,k}(i)$, are assumed to be i.i.d.
over time with fixed probability density functions.


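According to (56), when $\mathcal{B}$ is stable, the network mean
error vector converges to

$$b \triangleq \lim_{i \to \infty} \mathbb{E}[\tilde{w}_i] = -(I - \mathcal{B})^{-1} \mathcal{E}^T \omega^o \tag{63}$$

If $\hat{h}_{\ell,k}(i) = h_{\ell,k}(i)$ then $\mathcal{E} = 0$
and $\lim_{i \to \infty} \mathbb{E}[\tilde{w}_i] = 0$, i.e., the
algorithm will be asymptotically unbiased.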

Let us now find conditions under which $\mathcal{B}$ is stable,
i.e., conditions under which the spectral radius of
$\mathcal{B}$, denoted by $\rho(\mathcal{B})$, is strictly less
than one. We use the properties of the block maximum norm
$\|\cdot\|_{b,\infty}$ from [3], [42] to establish the following
relations:

$$\begin{aligned} \rho(\mathcal{B}) &\le \|\mathcal{B}\|_{b,\infty} \\ &\le \|(\mathcal{A} + \mathcal{E})^T\|_{b,\infty}\, \|I - \mathcal{M}\mathcal{R}\|_{b,\infty} \\ &\le \left( \|\mathcal{A}^T\|_{b,\infty} + \|\mathcal{E}^T\|_{b,\infty} \right) \|I - \mathcal{M}\mathcal{R}\|_{b,\infty} \\ &= \left( 1 + \|\mathcal{E}^T\|_{b,\infty} \right) \|I - \mathcal{M}\mathcal{R}\|_{b,\infty} \end{aligned} \tag{64}$$

where in the last equality we used the fact that
$\|\mathcal{A}^T\|_{b,\infty} = 1$ since $A$ is left-stochastic.
According to (64), $\rho(\mathcal{B})$ is bounded by one if

$$\|I - \mathcal{M}\mathcal{R}\|_{b,\infty} < \frac{1}{1 + \|\mathcal{E}^T\|_{b,\infty}} \tag{65}$$

Since $I - \mathcal{M}\mathcal{R}$ is block diagonal and
Hermitian, we have
$\|I - \mathcal{M}\mathcal{R}\|_{b,\infty} = \rho(I - \mathcal{M}\mathcal{R})$
[3]. The spectral radius of $I - \mathcal{M}\mathcal{R}$ will be
less than $1/(1 + \|\mathcal{E}^T\|_{b,\infty})$ if the absolute
maximum eigenvalue of each of its blocks is strictly less than
$1/(1 + \|\mathcal{E}^T\|_{b,\infty})$. This condition is
satisfied if at each node $k$ the step-size $\mu_k$ is chosen as:

$$\frac{1 - \dfrac{1}{1 + \|\mathcal{E}^T\|_{b,\infty}}}{\lambda_{\max}(R_{u,k})} < \mu_k < \frac{1 + \dfrac{1}{1 + \|\mathcal{E}^T\|_{b,\infty}}}{\lambda_{\max}(R_{u,k})} \tag{66}$$

where $\lambda_{\max}(\cdot)$ denotes the maximum eigenvalue of
its matrix argument. This relation reveals that the
mean-stability range of the algorithm, in terms of the step-size
parameters $\{\mu_k\}$, reduces as the channel estimation error
over the network increases. When the channel estimation error
approaches zero^1, that is when
$\|\mathcal{E}^T\|_{b,\infty} \to 0$, the stability condition
reduces to $0 < \mu_k < \frac{2}{\lambda_{\max}(R_{u,k})}$, which
is the mean stability range of diffusion LMS over ideal
communication links [3]. A similar analysis can be carried out
for the CTA diffusion strategy.

Theorem 1. Consider the diffusion strategies (41)-(42) with
the space-time data (1) and (2) satisfying Assumptions 1
and 2, respectively, and where the channel coefficients are
estimated using (26) with training symbols $s_k(i) = 1$. Then
the algorithms will be stable in the mean and the mean
error vector will converge to (63) if the step-sizes are chosen
according to (66).
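To make the step-size condition concrete, the following Python sketch evaluates the bounds in (66) and the right-hand side of (64) for the special case of scalar regressors ($M = 1$). In that case each block $R_{u,k}$ is a scalar variance and the block maximum norm reduces to the ordinary induced infinity norm; both simplifications, as well as the function names, are assumptions of this illustration.

```python
def step_size_range(eps, lam_max):
    """Lower and upper step-size bounds of (66) for one node.

    eps stands for the norm of the transposed mean error matrix and
    lam_max for the largest eigenvalue of R_{u,k}.
    """
    shrink = 1.0 / (1.0 + eps)
    return (1.0 - shrink) / lam_max, (1.0 + shrink) / lam_max


def inf_norm(mat):
    """Induced infinity norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in mat)


def mean_stability_bound(E, mu, sigma_u2):
    """Right-hand side of (64) for scalar regressors (M = 1).

    E[l][k] is the mean error entry e_{l,k}, mu[k] the step-size of node k
    and sigma_u2[k] its regressor variance.
    """
    n = len(E)
    E_T = [[E[l][k] for l in range(n)] for k in range(n)]
    eps = inf_norm(E_T)
    contraction = max(abs(1.0 - m * s) for m, s in zip(mu, sigma_u2))
    return (1.0 + eps) * contraction
```

A returned bound below one is sufficient for mean stability. Note that when `eps > 0` the lower limit from `step_size_range` is strictly positive, so a step-size that is too small can also fail this sufficient condition.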

4.2 Steady-State Mean-Square Performance 

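To study the mean-square performance of the algorithm, we need to
determine the network variance relation [1], [26], [43]. The
latter can be obtained by equating the weighted squared norms of
both sides of (54), and taking expectations under Assumptions 1
and 2:

$$\begin{aligned} \mathbb{E}\|\tilde{w}_i\|_{\Sigma}^2 ={}& \mathbb{E}\|\tilde{w}_{i-1}\|_{\boldsymbol{\Sigma}'}^2 + \mathbb{E}\left[\omega^{o*} (\mathcal{E}_i^T)^* \Sigma\, \mathcal{E}_i^T \omega^o\right] \\ &+ \mathbb{E}\left[p_i^* \mathcal{M} \big((\mathcal{A}_i + \mathcal{E}_i)^T\big)^* \Sigma\, (\mathcal{A}_i + \mathcal{E}_i)^T \mathcal{M} p_i\right] \\ &- 2\,\mathrm{Re}\left\{\mathbb{E}\left[\omega^{o*} (\mathcal{E}_i^T)^* \Sigma\, \mathcal{B}_i \tilde{w}_{i-1}\right]\right\} + \mathbb{E}\left[v_i^{(\psi)*} \Sigma\, v_i^{(\psi)}\right] \end{aligned} \tag{67}$$

where for a vector $x$ and a weighting matrix $\Sigma \ge 0$ with
compatible dimensions $\|x\|_{\Sigma}^2 = x^* \Sigma x$, and

$$\boldsymbol{\Sigma}' = \mathcal{B}_i^* \Sigma\, \mathcal{B}_i \tag{68}$$

Under the independence assumption between $\tilde{w}_{i-1}$ and
$\mathcal{R}_i$, it holds that

$$\mathbb{E}\left[\|\tilde{w}_{i-1}\|_{\boldsymbol{\Sigma}'}^2\right] = \mathbb{E}\|\tilde{w}_{i-1}\|_{\mathbb{E}[\boldsymbol{\Sigma}']}^2 \tag{69}$$

Using this equality in (67), we arrive at:

$$\begin{aligned} \mathbb{E}\|\tilde{w}_i\|_{\Sigma}^2 ={}& \mathbb{E}\|\tilde{w}_{i-1}\|_{\Sigma'}^2 + \mathrm{Tr}\left(\mathbb{E}\left[\mathcal{E}_i^T \omega^o \omega^{o*} (\mathcal{E}_i^T)^* \Sigma\right]\right) \\ &+ \mathrm{Tr}\left(\mathbb{E}\left[(\mathcal{A}_i + \mathcal{E}_i)^T \mathcal{M} p_i p_i^* \mathcal{M} \big((\mathcal{A}_i + \mathcal{E}_i)^T\big)^* \Sigma\right]\right) \\ &- 2\,\mathrm{Re}\left\{\mathrm{Tr}\left(\mathbb{E}\left[\mathcal{B}_i \tilde{w}_{i-1} \omega^{o*} (\mathcal{E}_i^T)^* \Sigma\right]\right)\right\} + \mathrm{Tr}\left(\mathbb{E}\left[v_i^{(\psi)} v_i^{(\psi)*} \Sigma\right]\right) \end{aligned} \tag{70}$$

where $\Sigma' = \mathbb{E}[\boldsymbol{\Sigma}']$. To compute
(70), we introduce:

$$\mathcal{P} = \mathbb{E}[p_i p_i^*] = \mathrm{diag}\{\sigma_{v,1}^2 R_{u,1}, \ldots, \sigma_{v,N}^2 R_{u,N}\} \tag{71}$$

$$\mathcal{R}_v = \mathbb{E}\left[v_i^{(\psi)} v_i^{(\psi)*}\right] = \mathrm{diag}\{R_{v,1}, \ldots, R_{v,N}\} \tag{72}$$

$$R_{v,k} \triangleq \mathbb{E}\left[v_{k,i}^{(\psi)} v_{k,i}^{(\psi)*}\right] = \sum_{\ell \in \mathcal{N}_k \setminus \{k\}} \mathbb{E}\left[a_{\ell,k}^2(i)\, |g_{\ell,k}(i)|^2\right] R_{v,\ell k}^{(\psi)} \tag{73}$$

where $R_{v,\ell k}^{(\psi)}$ denotes the covariance matrix of
$v_{\ell k,i}^{(\psi)}$. We show in Appendix A how to compute the
expectation term multiplying $R_{v,\ell k}^{(\psi)}$ in (73).
Alternatively, this term can be evaluated numerically by
averaging over repeated independent experiments.

1. The channel estimation error can be reduced by transmitting more
pilot symbols or increasing the SNR during pilot transmission.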

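To proceed, we assume that $\Sigma$ is partitioned into block
entries of size $M \times M$ and let
$\sigma = \mathrm{bvec}(\Sigma)$ denote the vector that is
obtained from the block vectorization of $\Sigma$. We shall write
$\|\tilde{w}_i\|_{\Sigma}^2$ and $\|\tilde{w}_i\|_{\sigma}^2$
interchangeably to denote the same weighted square norm [1].
Using properties of bvec and block Kronecker products [44], the
variance relation in (70) leads in steady-state to:

$$\lim_{i \to \infty} \mathbb{E}\|\tilde{w}_i\|_{\sigma}^2 = \lim_{i \to \infty} \mathbb{E}\|\tilde{w}_{i-1}\|_{\mathcal{F}\sigma}^2 + \gamma^T \sigma \tag{74}$$

where $\mathcal{F} = \mathbb{E}[\mathcal{B}_i^T \otimes_b \mathcal{B}_i^*]$, and

$$\begin{aligned} \gamma ={}& \mathbb{E}\left[\mathcal{E}_i^T \otimes_b \mathcal{E}_i^*\right] \mathrm{bvec}\big((\omega^o \omega^{o*})^T\big) \\ &+ \mathbb{E}\left[(\mathcal{A}_i + \mathcal{E}_i)^T \otimes_b (\mathcal{A}_i + \mathcal{E}_i)^*\right] \mathrm{bvec}(\mathcal{M} \mathcal{P}^T \mathcal{M}) \\ &- 2\,\mathrm{Re}\left\{\mathbb{E}\left[\mathcal{B}_i \otimes_b \mathcal{E}_i^*\right] \mathrm{bvec}\big((b\, \omega^{o*})^T\big)\right\} + \mathrm{bvec}(\mathcal{R}_v^T) \end{aligned} \tag{75}$$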
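The step from (70) to (74)-(75) rests on two vectorization identities, $\mathrm{vec}(X \Sigma Y) = (Y^T \otimes X)\, \mathrm{vec}(\Sigma)$ and $\mathrm{Tr}(X \Sigma) = \mathrm{vec}(X^T)^T \mathrm{vec}(\Sigma)$, together with their block counterparts. The Python sketch below checks the standard (non-block) versions on small matrices; it assumes blocks of size $1 \times 1$, in which case bvec and $\otimes_b$ coincide with vec and $\otimes$. The helper names are hypothetical.

```python
def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def kron(X, Y):
    """Standard Kronecker product X (x) Y."""
    return [[x * y for x in row_x for y in row_y]
            for row_x in X for row_y in Y]


def vec(X):
    """Stack the columns of X on top of each other."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]


def matvec(X, v):
    return [sum(x * c for x, c in zip(row, v)) for row in X]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))
```

For instance, `vec(matmul(matmul(X, S), Y))` and `matvec(kron(transpose(Y), X), vec(S))` return the same list for any conformable `X`, `S`, `Y`, which is the mechanism that turns the weighting matrix $\mathcal{B}_i^* \Sigma \mathcal{B}_i$ into the linear map $\mathcal{F}\sigma$.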

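Considering (55), matrix $\mathcal{F}$ can be written as:

$$\begin{aligned} \mathcal{F} &= \mathbb{E}\left\{ \left[(I - \mathcal{M}\mathcal{R}_i)^T (\mathcal{A}_i + \mathcal{E}_i)\right] \otimes_b \left[(\mathcal{A}_i + \mathcal{E}_i)^T (I - \mathcal{M}\mathcal{R}_i)\right]^* \right\} \\ &= \mathbb{E}\left\{ \left[(I - \mathcal{M}\mathcal{R}_i)^T \otimes_b (I - \mathcal{M}\mathcal{R}_i)\right] \times \left[(\mathcal{A}_i + \mathcal{E}_i) \otimes_b (\mathcal{A}_i + (\mathcal{E}_i^*)^T)\right] \right\} \end{aligned} \tag{76}$$

Since the entries of matrix $\mathcal{R}_i$, which are defined in
terms of the regression data $u_{k,i}$, are independent of the
entries of matrices $\mathcal{A}_i$ and $\mathcal{E}_i$, i.e.,
$a_{\ell,k}(i)$ and $e_{\ell,k}(i)$, matrix $\mathcal{F}$ in (76)
can be written more compactly as:

$$\mathcal{F} = \mathcal{X} \mathcal{D} \tag{77}$$

where

$$\mathcal{X} = \mathbb{E}\left[(I - \mathcal{M}\mathcal{R}_i)^T \otimes_b (I - \mathcal{M}\mathcal{R}_i)\right] \tag{78}$$

$$\mathcal{D} \triangleq \mathbb{E}[\mathcal{D}_i] = \mathbb{E}\left[(\mathcal{A}_i + \mathcal{E}_i) \otimes_b (\mathcal{A}_i + (\mathcal{E}_i^*)^T)\right] \tag{79}$$

We can find an expression for $\mathcal{X}$ if we assume that the
regression data $u_{k,i}$ are circular Gaussian; see equation
(80) and Appendix B, where $e_k$ is a unit basis vector in
$\mathbb{R}^N$ with entry one at position $k$,
$r_k = \mathrm{vec}(R_{u,k})$, $\beta = 2$ for real-valued data
and $\beta = 1$ for complex-valued data. A simplified expression
can be found to compute $\mathcal{X}$ without using the Gaussian
assumption on the regression data provided that the following
condition holds.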


Assumption 3. The channel estimation errors over the network are small enough such that the adaptation step-sizes in (66) can be chosen sufficiently small.

In cases where the distribution of the regression data is unknown, under Assumption 3, the contributing terms depending on $\mu_k^2$ can be neglected and, as a result, $\mathcal{X}$ can be approximated by

$$\mathcal{X} \approx (I-\mathcal{M}\mathcal{R})^T \otimes_b (I-\mathcal{M}\mathcal{R}) \tag{81}$$

In Appendix C, we show how to obtain the matrix $\mathcal{D}$ in (79) needed for computing $\mathcal{F}$ in (77). To evaluate $\gamma$, we use the following relations, which are also established in Appendix C:

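$$E[\mathcal{E}_i^T \otimes_b \mathcal{E}_i^*] = E[E_i^T \otimes E_i^*] \otimes I_{M^2} \tag{82}$$

$$\begin{aligned}
E\big[(\mathcal{A}_i+\mathcal{E}_i)^T \otimes_b (\mathcal{A}_i^T+\mathcal{E}_i^*)\big]
= \Big(& E[(A_i \otimes A_i)^T] + E[A_i^T \otimes E_i^*] \\
&+ E[(E_i \otimes A_i)^T] + E[E_i^T \otimes E_i^*]\Big) \otimes I_{M^2}
\end{aligned} \tag{83}$$

$$E[\mathcal{B}_i \otimes_b \mathcal{E}_i^*] = \Big\{\big(E[A_i^T \otimes E_i^*] + E[E_i^T \otimes E_i^*]\big) \otimes I_{M^2}\Big\} \times \Big\{(I_{MN}-\mathcal{M}\mathcal{R}) \otimes_b I_{MN}\Big\} \tag{84}$$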

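To obtain mean-square error (MSE) steady-state expressions for the network, we let $i$ go to infinity and use expression (74) to write:

$$\lim_{i\to\infty}E\|\tilde{w}_i\|^2_{(I-\mathcal{F})\sigma} = \gamma^T\sigma \tag{85}$$

Since we are free to choose $\Sigma$ and hence $\sigma$, we choose $(I-\mathcal{F})\sigma = \mathrm{bvec}(\Omega)$, where $\Omega$ is another arbitrary positive semidefinite matrix. Doing so, we arrive at:

$$\lim_{i\to\infty}E\|\tilde{w}_i\|^2_{\Omega} = \gamma^T(I-\mathcal{F})^{-1}\mathrm{bvec}(\Omega) \tag{86}$$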


Recall from (48) that each sub-vector of $\tilde{w}_i$ corresponds to the estimation error at a particular node; for instance, $\tilde{w}_{k,i}$ is the estimation error at node $k$. Therefore, using (86), the MSD at node $k$, denoted by $\eta_k$, can be computed by choosing $\Omega = \mathrm{diag}(e_k) \otimes I_M$, i.e.:

$$\eta_k = \lim_{i\to\infty}E\|\tilde{w}_{k,i}\|^2 = \lim_{i\to\infty}E\|\tilde{w}_i\|^2_{\mathrm{diag}(e_k)\otimes I_M} = \gamma^T(I-\mathcal{F})^{-1}\mathrm{bvec}\big(\mathrm{diag}(e_k) \otimes I_M\big) \tag{87}$$


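The network MSD, denoted by $\eta$, is then defined as:

$$\eta = \lim_{i\to\infty}\frac{1}{N}\sum_{k=1}^{N}E\|\tilde{w}_{k,i}\|^2 \tag{88}$$

which can also be computed from (86) by using $\Omega = \frac{1}{N}I_{MN}$. This leads to:

$$\eta = \lim_{i\to\infty}\frac{1}{N}E\|\tilde{w}_i\|^2 = \frac{1}{N}\gamma^T(I-\mathcal{F})^{-1}\mathrm{bvec}(I_{MN}) \tag{89}$$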
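Expressions of the form $\gamma^T(I-\mathcal{F})^{-1}\mathrm{bvec}(\Omega)$ can be evaluated without forming the inverse explicitly: whenever $\rho(\mathcal{F}) < 1$, the inverse equals the Neumann series $\sum_{j\ge 0}\mathcal{F}^j$. The sketch below illustrates this with a toy real $2\times 2$ matrix; the function name, the stopping rule and all numerical values are assumptions made for the example and do not come from the paper.

```python
def matvec(F, x):
    return [sum(f * v for f, v in zip(row, x)) for row in F]

def steady_state_value(F, gamma, omega, tol=1e-12, max_iter=10_000):
    """Approximate gamma^T (I - F)^{-1} omega by summing gamma^T F^j omega.

    The series only converges when the spectral radius of F is below one.
    """
    total = 0.0
    term = list(omega)                 # holds F^j omega, starting at j = 0
    for _ in range(max_iter):
        total += sum(g * t for g, t in zip(gamma, term))
        term = matvec(F, term)
        if max(abs(t) for t in term) < tol:
            break
    return total

# Toy data (illustrative only): F has eigenvalues of roughly 0.57 and 0.23.
F = [[0.5, 0.1], [0.2, 0.3]]
gamma = [1.0, 2.0]
omega = [1.0, 0.0]
value = steady_state_value(F, gamma, omega)   # exact answer is 10/3
```

For the matrix sizes that arise in practice, $(MN)^2 \times (MN)^2$, a linear solve of $(I-\mathcal{F})x = \mathrm{bvec}(\Omega)$ is usually the more economical choice; the series form is shown here because it mirrors the recursion used in the transient analysis.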

In (87) and (89), we assume that $(I-\mathcal{F})$ is invertible. In what follows, we find conditions under which this assumption is satisfied. Using the properties of the Kronecker product and the sub-multiplicative property of norms, we can write:

$$\rho(\mathcal{F}) \le \|\mathcal{X}\mathcal{D}\|_{b,\infty} \le \|\mathcal{X}\|_{b,\infty}\,\|\mathcal{D}\|_{b,\infty} \tag{90}$$


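$$\mathcal{X} = (I-\mathcal{M}\mathcal{R})^T \otimes_b (I-\mathcal{M}\mathcal{R}) + \Big\{\sum_{k=1}^{N}\Big[\mathrm{diag}\big(\mathrm{vec}(\mathrm{diag}(e_k))\big) \otimes \big((\beta-1)(R_{u,k}^T \otimes R_{u,k}) + r_kr_k^*\big)\Big]\Big\}(\mathcal{M} \otimes_b \mathcal{M}) \tag{80}$$

We next show that $\mathcal{X}$ from (81) is a block diagonal Hermitian matrix with block size $NM^2 \times NM^2$. To this end, we note that $I-\mathcal{M}\mathcal{R}$ is a block diagonal matrix with block size $M \times M$ and then use (81) to obtain:

$$\mathcal{X} = \mathrm{diag}\big\{(I-\mu_1R_{u,1})^T \otimes (I-\mathcal{M}\mathcal{R}),\ \cdots,\ (I-\mu_NR_{u,N})^T \otimes (I-\mathcal{M}\mathcal{R})\big\} \tag{91}$$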

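Moreover, $\mathcal{X}$ is Hermitian because, considering $\mathcal{R} = \mathcal{R}^*$, $\mathcal{M} = \mathcal{M}^T$ and $\mathcal{R}\mathcal{M} = \mathcal{M}\mathcal{R}$, we have

$$\mathcal{X}^* = \big((I-\mathcal{M}\mathcal{R})^T\big)^* \otimes_b (I-\mathcal{M}\mathcal{R})^* = (I-\mathcal{M}\mathcal{R})^T \otimes_b (I-\mathcal{M}\mathcal{R}) = \mathcal{X} \tag{92}$$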

Now we can use the following lemma to bound the spectral radius of matrix $\mathcal{F}$ in (90).

Lemma 1. Consider an $N \times N$ block diagonal Hermitian matrix $Y = \mathrm{diag}\{Y_1, Y_2, \cdots, Y_N\}$, where each block $Y_k$ is of size $M \times M$ and Hermitian. Then it holds that [3]:

$$\|Y\|_{b,\infty} = \max_{1\le k\le N}\rho(Y_k) = \rho(Y) \tag{93}$$


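According to this lemma, since $\mathcal{X}$ is block diagonal Hermitian, we can substitute its block maximum norm on the right hand side of relation (90) with its spectral radius and obtain:

$$\rho(\mathcal{F}) \le \rho\big((I-\mathcal{M}\mathcal{R})^T \otimes_b (I-\mathcal{M}\mathcal{R})\big)\,\|\mathcal{D}\|_{b,\infty} = \rho^2(I-\mathcal{M}\mathcal{R})\,\|\mathcal{D}\|_{b,\infty} \tag{94}$$

We then deduce that $\rho(\mathcal{F}) < 1$ if:

$$0 < \rho(I-\mathcal{M}\mathcal{R}) < \frac{1}{\sqrt{\|\mathcal{D}\|_{b,\infty}}} \tag{95}$$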

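Since $I-\mathcal{M}\mathcal{R}$ is a block-diagonal matrix, this condition will be satisfied for small step-sizes that also satisfy:

$$\frac{1-\dfrac{1}{\sqrt{\|\mathcal{D}\|_{b,\infty}}}}{\lambda_{\max}(R_{u,k})} < \mu_k < \frac{1+\dfrac{1}{\sqrt{\|\mathcal{D}\|_{b,\infty}}}}{\lambda_{\max}(R_{u,k})} \tag{96}$$

If the channel estimation error is small, then $\|\mathcal{E}\|_{b,\infty} \approx 0$ and $\mathcal{D} \approx \mathcal{A} \otimes_b \mathcal{A}$. Subsequently, $\|\mathcal{D}\|_{b,\infty} \approx 1$ and this mean-square stability condition reduces to $0 < \mu_k < \frac{2}{\lambda_{\max}(R_{u,k})}$, which is the mean-square stability range of diffusion LMS over ideal communication links [3].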
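As a numerical illustration of condition (95): because $I-\mathcal{M}\mathcal{R}$ is block diagonal with Hermitian blocks $I - \mu_kR_{u,k}$, its spectral radius is the largest value of $|1-\mu_k\lambda|$ over all nodes $k$ and all eigenvalues $\lambda$ of $R_{u,k}$. The sketch below checks the condition for given step-sizes. It assumes the eigenvalues of each $R_{u,k}$ are supplied directly; the function names and the argument `d_norm`, which stands in for $\|\mathcal{D}\|_{b,\infty}$, are hypothetical.

```python
import math

def spectral_radius_I_minus_MR(step_sizes, eigenvalues_per_node):
    """rho(I - M R) for block-diagonal M R, given each node's step-size
    mu_k and the eigenvalues of its covariance matrix R_{u,k}."""
    return max(abs(1.0 - mu * lam)
               for mu, lams in zip(step_sizes, eigenvalues_per_node)
               for lam in lams)

def satisfies_condition_95(step_sizes, eigenvalues_per_node, d_norm=1.0):
    """True when 0 < rho(I - M R) < 1 / sqrt(d_norm)."""
    rho = spectral_radius_I_minus_MR(step_sizes, eigenvalues_per_node)
    return 0.0 < rho < 1.0 / math.sqrt(d_norm)

# Small step-size, d_norm = 1 (negligible channel estimation error): stable.
ok_small = satisfies_condition_95([0.01], [[1.0, 2.0]])
# Step-size above 2 / lambda_max: the condition fails.
ok_large = satisfies_condition_95([1.2], [[2.0]])
# Same small step-size, but a larger d_norm tightens the bound.
ok_noisy = satisfies_condition_95([0.01], [[1.0, 2.0]], d_norm=1.5)
```

The third call shows why a very small step-size alone is not sufficient when $\|\mathcal{D}\|_{b,\infty}$ exceeds one: $\rho(I-\mathcal{M}\mathcal{R})$ approaches one from below as $\mu_k \to 0$, while the bound in (95) is then strictly less than one.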


4.3 Mean-Square Transient Behavior 

In this part, we derive expressions to characterize the mean-square convergence behavior of the diffusion algorithms over wireless networks with fading channels and noisy communication links. To derive these expressions, it is assumed that each node knows the CSI of its neighbors, and $\mathcal{E}_i = 0$ for all $i$. We then use (67) and consider $w_{k,-1} = 0$, $\forall k \in \{1,\cdots,N\}$ to arrive at:

$$E\|\tilde{w}_i\|^2_{\sigma} = \|\omega^o\|^2_{\mathcal{F}^{i+1}\sigma} + \gamma^T\sum_{j=0}^{i}\mathcal{F}^j\sigma \tag{97}$$


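where

$$\gamma = E[\mathcal{A}_i^T \otimes_b \mathcal{A}_i^T]\,\mathrm{bvec}(\mathcal{M}\mathcal{P}^T\mathcal{M}) + \mathrm{bvec}(\mathcal{R}_v^T) \tag{98}$$

$$\mathcal{R}_v = \mathrm{diag}\{R_{v,1},\cdots,R_{v,N}\} \tag{99}$$

$$R_{v,k} = \sum_{\ell\in\mathcal{N}_k\setminus\{k\}} E\big[a_{\ell,k}^2(i)\,|g_{\ell,k}(i)|^2\big]\,R^{(\psi)}_{v,\ell k} \tag{100}$$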

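Under this condition, and since $\mathcal{E}_i = 0$, $\mathcal{F}$ can be expressed as:

$$\mathcal{F} \approx \mathcal{X}\,E[\mathcal{A}_i \otimes_b \mathcal{A}_i] \tag{101}$$

Writing (97) for $i-1$ and computing $E\|\tilde{w}_i\|^2_{\sigma} - E\|\tilde{w}_{i-1}\|^2_{\sigma}$ leads to:

$$E\|\tilde{w}_i\|^2_{\sigma} = E\|\tilde{w}_{i-1}\|^2_{\sigma} - \|\omega^o\|^2_{\mathcal{F}^i(I-\mathcal{F})\sigma} + \gamma^T\mathcal{F}^i\sigma \tag{102}$$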

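By replacing $\sigma$ with $\sigma_{\mathrm{msd}_k} = \mathrm{diag}(e_k) \otimes I_M$ and $\sigma_{\mathrm{emse}_k} = \mathrm{diag}(e_k) \otimes R_{u,k}$, we arrive at two recursions for the evolution of the MSD and EMSE over time:

$$\eta_k(i) = \eta_k(i-1) - \|\omega^o\|^2_{\mathcal{F}^i(I-\mathcal{F})\sigma_{\mathrm{msd}_k}} + \gamma^T\mathcal{F}^i\sigma_{\mathrm{msd}_k} \tag{103}$$

$$\zeta_k(i) = \zeta_k(i-1) - \|\omega^o\|^2_{\mathcal{F}^i(I-\mathcal{F})\sigma_{\mathrm{emse}_k}} + \gamma^T\mathcal{F}^i\sigma_{\mathrm{emse}_k} \tag{104}$$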

We can find the learning curves of the network MSD and EMSE either by averaging the nodes' learning curves (103) and (104), or by substituting, respectively, the following two values for $\sigma$ in recursion (102):

$$\sigma_{\mathrm{msd}} = \frac{1}{N}\mathrm{bvec}(I_{MN}) \tag{105}$$

$$\sigma_{\mathrm{emse}} = \frac{1}{N}\mathrm{bvec}\big(\mathrm{diag}\{R_{u,1},\cdots,R_{u,N}\}\big) \tag{106}$$
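The recursive form of the learning curve is convenient numerically because each step needs only one matrix-vector product with $\mathcal{F}$. The sketch below iterates a recursion of the type $\eta(i) = \eta(i-1) - \|\omega^o\|^2_{\mathcal{F}^i(I-\mathcal{F})\sigma} + \gamma^T\mathcal{F}^i\sigma$ and compares it with the corresponding closed form $\|\omega^o\|^2_{\mathcal{F}^{i+1}\sigma} + \gamma^T\sum_{j=0}^{i}\mathcal{F}^j\sigma$. It assumes real-valued data, so that the weighted norm $\|\omega^o\|^2_{s}$ is linear in $s$ and can be written as `dot(a0, s)` for a fixed vector `a0`. The function names and the toy numbers are illustrative only.

```python
def matvec(F, x):
    return [sum(f * v for f, v in zip(row, x)) for row in F]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def learning_curve(F, gamma, a0, sigma, n_iter):
    """Iterate eta(i) = eta(i-1) - a0^T F^i (I - F) sigma + gamma^T F^i sigma,
    starting from eta(-1) = a0^T sigma (zero initial estimates)."""
    eta = dot(a0, sigma)
    Fi_sigma = list(sigma)                 # holds F^i sigma, starting at i = 0
    curve = []
    for _ in range(n_iter):
        F_next = matvec(F, Fi_sigma)       # F^{i+1} sigma
        diff = [x - y for x, y in zip(Fi_sigma, F_next)]   # F^i (I - F) sigma
        eta = eta - dot(a0, diff) + dot(gamma, Fi_sigma)
        curve.append(eta)
        Fi_sigma = F_next
    return curve

def closed_form(F, gamma, a0, sigma, i):
    """a0^T F^{i+1} sigma + gamma^T sum_{j=0}^{i} F^j sigma."""
    term, acc = list(sigma), 0.0
    for _ in range(i + 1):
        acc += dot(gamma, term)            # adds gamma^T F^j sigma
        term = matvec(F, term)
    return dot(a0, term) + acc             # term now equals F^{i+1} sigma

# Toy data, not taken from the paper.
F = [[0.5, 0.1], [0.2, 0.3]]
gamma = [0.1, 0.2]
a0 = [1.0, 2.0]
sigma = [1.0, 1.0]
curve = learning_curve(F, gamma, a0, sigma, 6)
```

Both forms give the same values; the recursion avoids recomputing the partial sum of powers of $\mathcal{F}$ at every time index.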


5 Numerical Results 


In this section, we present computer experiments to illustrate the performance of the ATC diffusion strategy (41)-(42) in the estimation of the unknown parameter vector $w^o = 2[1+j1,\ -1+j1]^T$ over time-varying wireless channels. We consider a network with $N = 10$ nodes, which are randomly spread over a unit square area $(x,y) \in [0,1] \times [0,1]$, as shown in Fig. 2. We choose a transmit power of $P_t = 1$, a nominal transmission range of $r_0 = 0.4$ and a path-loss exponent of $\alpha = 3.2$. For each node $k \in \{1,2,\cdots,N\}$, we set $\mu_k = 0.01$ and $w_{k,-1} = 0$. We adopt zero-mean Gaussian random distributions to generate $v_k(i)$, $v^{(\psi)}_{\ell k,i}$ and $u_{k,i}$. The distribution of the communication noise power over the spatial domain is illustrated in Fig. 3. The regression data $u_{k,i}$ have covariance matrices of the form $R_{u,k} = \sigma_{u,k}^2I_M$. The trace of the regression data covariance, $\mathrm{Tr}(R_{u,k})$, and the variances of measurement noise, $\sigma_{v,k}^2$, are illustrated in Fig. 4.
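A topology of the kind described above can be generated as follows. This is a minimal sketch, assuming that node positions are drawn uniformly over the unit square and that two nodes are neighbors when their Euclidean distance is below the nominal transmission range; the function name, the seed and the data layout are choices made for the example, not details given in the paper.

```python
import math
import random

def random_topology(n_nodes=10, radius=0.4, seed=0):
    """Place n_nodes uniformly at random in [0, 1] x [0, 1] and connect
    any two nodes whose distance is below `radius`.

    Returns (positions, neighbors), where neighbors[k] is the set N_k and
    always contains k itself."""
    rng = random.Random(seed)
    positions = [(rng.random(), rng.random()) for _ in range(n_nodes)]
    neighbors = []
    for k in range(n_nodes):
        nk = {k}
        for l in range(n_nodes):
            if l != k and math.dist(positions[k], positions[l]) < radius:
                nk.add(l)
        neighbors.append(nk)
    return positions, neighbors

positions, neighbors = random_topology()
```

The resulting neighborhoods describe only the start-up connectivity; under fading, the set of links that are actually usable changes from one time instant to the next.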

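The exchanged data between nodes experience distortion characterized by (2). At time $i$, the link between nodes $\ell$ and $k$ fails with probability $1 - p_{\ell,k}$. We obtain $\gamma_{\ell,k}$ using the relative-degree combination rule [2], [3], i.e.,

$$\gamma_{\ell,k} = \begin{cases} \dfrac{|\mathcal{N}_\ell|}{\sum_{m\in\mathcal{N}_k}|\mathcal{N}_m|}, & \text{if } \ell \in \mathcal{N}_k \\[2mm] 0, & \text{otherwise} \end{cases} \tag{107}$$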
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Fig. 2: This graph shows the topology of the wireless network at the 
start-up time i = 0, where two nodes are connected if their distance is 
less than their transmission range, ro = 0.4. 
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Fig. 3: Power of communication noise over the network. 


and update $A_i$ at each time $i$ according to the introduced combination rule (22).
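The relative-degree rule gives each neighbor a weight proportional to the size of its own neighborhood, normalized so that the weights entering a node sum to one. The following sketch computes these nominal weights from a list of neighborhoods. The function name and the data layout (`neighbors[k]` as a set that includes `k`) are assumptions of the example, and the time-varying update of rule (22) is not reproduced here.

```python
def relative_degree_weights(neighbors):
    """Return gamma with gamma[l][k] = |N_l| / sum_{m in N_k} |N_m|
    when l is in N_k, and 0 otherwise.

    neighbors[k] is the set N_k, including node k itself."""
    n = len(neighbors)
    degree = [len(nk) for nk in neighbors]
    gamma = [[0.0] * n for _ in range(n)]
    for k in range(n):
        total = sum(degree[m] for m in neighbors[k])
        for l in neighbors[k]:
            gamma[l][k] = degree[l] / total
    return gamma

# Three nodes on a line: 0 - 1 - 2.
line = [{0, 1}, {0, 1, 2}, {1, 2}]
gamma = relative_degree_weights(line)
column_sums = [sum(gamma[l][k] for l in range(3)) for k in range(3)]
```

In this small example node 0 weights its better-connected neighbor, node 1, more heavily than itself (0.6 against 0.4), which is the intended effect of the rule.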

Figures 5 and 6 show the network MSD in transient and steady-state regimes, where the simulation curves are obtained from the average of 500 independent runs.
In these figures, we compare the performance of the 
proposed ATC diffusion algorithm over wireless chan¬ 
nels for different CSI cases at the receiving nodes. In 
particular, we examine the performance of the algorithm 
with perfect CSI, where each node k knows the CSI of all 
its neighbors. We also consider scenarios where nodes do 




Fig. 4: Network energy profile (horizontal axis: node k).

Fig. 5: Learning curves of the network in terms of MSD and EMSE.

Fig. 6: Steady-state MSD over the network.


not have access to the CSI of their neighbors and obtain this information using one or two samples of pilot data. For reference, we also illustrate the performance of ATC diffusion over ideal communication links in which the communication links between nodes are error-free, i.e., for each node $k$, $\psi_{\ell k,i} = \psi_{\ell,i}$ for all $i$.

The best performance in these experiments belongs to the diffusion strategy that runs over a network with ideal communication links. As expected, the diffusion strategy with perfect CSI knowledge outperforms the diffusion strategy with channel estimation using one or two samples of pilot data, respectively, by 5dB and 7dB. In particular, the steady-state mean-square performance of the algorithm improves by almost 2dB for an additional sample of pilot data used for channel estimation. Therefore, if the wireless channels are slowly-varying, by using a larger number of pilot data, it is possible to approach the performance of the diffusion strategy with perfect CSI.

Fig. 7: The network performance comparison with non-cooperative diffusion LMS and with diffusion LMS over ideal communication links.

We have also produced a transient MSD curve using standard diffusion LMS [2], under similar fading conditions and noise. The results showed that the network MSD grows unbounded (i.e., error $\to \infty$). This behavior can be explained by the fact that some nodes, in the combination step, use severely distorted data from neighbors with bad channel conditions and low SNR. Consequently, a large error is introduced into their updated intermediate estimates, which then propagates into the network in the following iterations and causes catastrophic network failure.

In Fig. 7, we compare the performance of diffusion strategies for different ranges of SNR over the network. We also make some comparisons between the cooperative and non-cooperative networks, where in the latter case the network runs a stand-alone LMS filter at each node, which is equivalent to running the diffusion strategy with $A_i = I$. In Fig. 7, the SNR index $n \in \{1,2,\cdots,7\}$ over the x-axis refers to the $n$-th network SNR distribution, as obtained by uniformly scaling up the initial SNR distribution over the network by 5dB for each increment in the integer $n$, as represented by $\mathrm{SNR}_n = \mathrm{SNR}_{\mathrm{ini}} + 5n$ (dB), where $\mathrm{SNR}_{\mathrm{ini}}$ are the SNRs of the connected nodes illustrated in Fig. 2, and are obtained from uniformly distributed random variables in the range [5, 10] dB.
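The SNR scaling used for the horizontal axis of Fig. 7 amounts to adding a fixed offset in decibels to every link. The sketch below makes this explicit and includes the conversion to linear scale that is needed when noise variances are derived from the SNR values. The function names and the sample values are illustrative assumptions.

```python
def scaled_snr_db(snr_init_db, n, step_db=5.0):
    """SNR_n = SNR_ini + step_db * n, applied to every link (values in dB)."""
    return [s + step_db * n for s in snr_init_db]

def db_to_linear(x_db):
    """Convert a power ratio from decibels to linear scale."""
    return 10.0 ** (x_db / 10.0)

# Two links with initial SNRs inside the [5, 10] dB range.
snr_ini = [5.0, 10.0]
snr_2 = scaled_snr_db(snr_ini, 2)            # second SNR distribution
snr_2_linear = [db_to_linear(s) for s in snr_2]
```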

As shown in Fig. 7, the performance of non-cooperative adaptation and diffusion LMS with ideal communication links remains invariant with changes in the SNR values. This is expected since the performance of the diffusion LMS in these cases is not affected by the communication noise $v^{(\psi)}_{\ell k,i}$. In comparison, the performance of the modified diffusion strategy over wireless links depends on the CSI. As the knowledge about the network CSI increases, the performance improves. From this result, we observe that at low SNR the performance discrepancy between diffusion with perfect CSI and diffusion with channel estimation is larger than in high SNR scenarios. This difference in performance can be reduced by using more pilot data to estimate the channel coefficients in each time slot. In addition, at very low SNR, we see that the non-cooperative case outperforms the modified diffusion strategy. This result suggests that in wireless networks with high levels of communication noise at all nodes (e.g., when the nodes' transmit power is very low), to maintain a satisfactory performance level the network must switch to the non-cooperative mode. This also suggests that if the transmit power of some nodes is below some threshold value, these nodes should go to a sleep mode in order to avoid error propagation over the network.

6 Conclusion 

We extended the application of diffusion LMS strategies to sensor networks with time-varying fading wireless channels. We analyzed the convergence behavior of the modified diffusion LMS algorithms, and established conditions under which the algorithms converge and remain stable in the mean and mean-square error sense. The analysis revealed that the performance of the diffusion strategies depends strongly on the level of CSI knowledge and the level of communication noise power over the network. In particular, when the CSI are known, the modified diffusion algorithms are asymptotically unbiased and converge in the slow adaptation regime. In contrast, the parameter estimates will become biased when the CSI are obtained through pilot-aided channel estimation. Nevertheless, the size of the bias can be made small by increasing the number of pilot symbols or increasing the link SNR.
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Appendix A 
Computation of ^ 

To obtain $R_{v,k}$ in (73), we need to compute the expectation






(108) 


for $\ell \in \mathcal{N}_k \setminus k$. For the case $\ell = k$, we have $R^{(\psi)}_{v,kk} = 0$ and hence the expectation of $[a_{\ell,k}^2(i)\,|g_{\ell,k}(i)|^2]R^{(\psi)}_{v,\ell k}$ in (73) is zero. For $\ell \neq k$, we proceed as follows. Since the joint probability distribution function of the numerator and denominator in (108) is unknown, the expectation can be approximated in one of two ways. In the first method, we can resort to computer simulations. In the second method, we can resort to a Taylor series approximation as follows. We introduce the real-valued auxiliary variable $x = a_{\ell,k}^2(i)$. Considering the combination rule (22), the expectation of $x$ when $\ell \neq k$ will be:

$$E[x] = \gamma_{\ell,k}^2\,p_{\ell,k} \tag{109}$$

To compute the variance and expectation of the denominator in (108), we let the exponential distribution function $f_y(y)$ with parameter $\lambda$ given by (27) denote the pdf of $y = |h_{\ell,k}(i)|^2$, i.e.,

fyiv) = , for y G [0, oo) (110) 

We also let $f'_y(y)$ represent the pdf of $y$ for $y \in [\nu_{\ell,k}, \infty)$. It can be verified that $f'_y(y)$ represents a truncated exponential distribution and is given by:


fy\y) = ^e,ke for y G [vt^k, oo) (111) 

If we now define 


Pt 
' e,k 


Then, the pdf of $z$ can be computed as [38]:

dz 




where 


Therefore, 


dy ^“fc , -1. ^ ^tk 

^ = —ands W = 


( 112 ) 

(113) 

(114) 


' e.,k 


Uz) = ^ for z G 

(115) 

Using this distribution, the mean and variance of $z$ will be [38]:



(116) 

(117) 


We can now proceed to approximate the expectation (108) by defining

$$f(x,z) = \frac{x}{z} \tag{118}$$

and employing a second order Taylor series expansion to write:

$$E[f(x,z)] \approx \frac{E[x]}{E[z]} - \frac{1}{E^2[z]}\,\mathrm{cov}(x,z) + \frac{E[x]}{E^3[z]}\,\mathrm{var}(z) \tag{119}$$
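The second-order Taylor approximation of the mean of a ratio is easy to evaluate once the first and second moments are available. The sketch below implements the standard expansion of $E[x/z]$ around $(E[x], E[z])$ and checks it on a case with a known answer: $x = 1$ and $z$ uniform on $[1, 2]$, for which $E[1/z] = \ln 2$. The function name and the test distribution are illustrative and unrelated to the channel model of the paper.

```python
import math

def ratio_mean_taylor(mean_x, mean_z, cov_xz, var_z):
    """Second-order Taylor approximation of E[x / z] around (E[x], E[z]):
    E[x]/E[z] - cov(x, z)/E[z]^2 + E[x] var(z)/E[z]^3."""
    return (mean_x / mean_z
            - cov_xz / mean_z ** 2
            + mean_x * var_z / mean_z ** 3)

# x = 1 (deterministic), z ~ Uniform[1, 2]: mean 1.5, variance 1/12.
approx = ratio_mean_taylor(1.0, 1.5, 0.0, 1.0 / 12.0)
exact = math.log(2.0)
first_order_only = 1.0 / 1.5
```

In this example the variance correction moves the estimate from about 0.667 to about 0.691, against an exact value of about 0.693. The approximation is accurate when the spread of $z$ is small relative to its mean and degrades when $z$ can come close to zero.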


Substituting $E[x]$, $E[z]$, $\mathrm{cov}(x,z)$ and $\mathrm{var}(z)$ into (119), we then arrive at:


Mf{x, z)] Ri E [al^ii) |y^,fc(f)P] 
/ 1 


lkpe,k ( 


1 




:) 


( 120 ) 


Appendix B 
Derivation of (80) 

First, we note that when $u_{k,i}$ are zero mean circular complex-valued Gaussian random vectors and i.i.d. over time, then for any Hermitian matrix $\Gamma$ of compatible dimensions it holds that [43]:

$$E[u_{k,i}^*u_{k,i}\,\Gamma\,u_{k,i}^*u_{k,i}] = \beta\,(R_{u,k}\Gamma R_{u,k}) + R_{u,k}\mathrm{Tr}(\Gamma R_{u,k}) \tag{121}$$

where $\beta = 1$ for complex regressors and $\beta = 2$ when the regressors are real. Using (121) and spatial independence of the regression data we have

$$E[u_{k,i}^*u_{k,i}\,\Gamma\,u_{\ell,i}^*u_{\ell,i}] = R_{u,k}\Gamma R_{u,\ell} + \delta_{k\ell}(\beta-1)R_{u,k}\Gamma R_{u,k} + \delta_{k\ell}R_{u,k}\mathrm{Tr}(\Gamma R_{u,k}) \tag{122}$$

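where $\delta_{k\ell}$ is the Kronecker delta. To compute $\mathcal{X}$, we first introduce

$$\mathcal{L}_i = (I-\mathcal{M}\mathcal{R}_i)\,Q\,(I-\mathcal{M}\mathcal{R}_i) \tag{123}$$

where $Q$ is an arbitrary deterministic Hermitian matrix. We now note that

$$\mathrm{bvec}(E[\mathcal{L}_i]) \overset{(i)}{=} E\big[(I-\mathcal{M}\mathcal{R}_i)^T \otimes_b (I-\mathcal{M}\mathcal{R}_i)\big]\mathrm{bvec}(Q) \overset{(ii)}{=} \mathcal{X}\,\mathrm{bvec}(Q) \tag{124}$$

where (ii) is obtained by comparing the expectation term on the right hand side of (i) with definition (78). We proceed by taking expectation of both sides of (123), i.e.,

$$E[\mathcal{L}_i] = Q - \mathcal{R}\mathcal{M}Q - Q\mathcal{M}\mathcal{R} + E[\mathcal{R}_i\mathcal{M}Q\mathcal{M}\mathcal{R}_i] \tag{125}$$

To compute the block vectorization of the last term on the right hand side of (125), we introduce the block partitioned matrix $Q' = \mathcal{M}Q\mathcal{M}$ and use (122) to obtain (126), where $r_k = \mathrm{vec}(R_{u,k})$. Now, using (125), we can write:

$$\mathrm{bvec}(E[\mathcal{L}_i]) = (I - I \otimes_b \mathcal{M}\mathcal{R} - \mathcal{R}^T\mathcal{M} \otimes_b I)\,\mathrm{bvec}(Q) + \mathrm{bvec}(E[\mathcal{R}_iQ'\mathcal{R}_i]) \tag{127}$$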















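$$\mathrm{bvec}(E[\mathcal{R}_iQ'\mathcal{R}_i]) = \Big\{(\mathcal{R}^T \otimes_b \mathcal{R}) + \sum_{k=1}^{N}\Big[\mathrm{diag}\big(\mathrm{vec}(\mathrm{diag}(e_k))\big) \otimes \big((\beta-1)(R_{u,k}^T \otimes R_{u,k}) + r_kr_k^*\big)\Big]\Big\}(\mathcal{M} \otimes_b \mathcal{M})\,\mathrm{bvec}(Q) \tag{126}$$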


From (124), (126) and (127), and using the fact that the real vector space of $MN \times MN$ Hermitian matrices is isomorphic to $\mathbb{R}^{M^2N^2}$, we arrive at (80).
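As a sanity check on the fourth-moment identity (121), it is instructive to specialize it to the scalar case $M = 1$, where $R_{u,k} = \sigma_u^2$ and $\Gamma$ is a real scalar. The worked equations below use only the standard Gaussian moments $E[u^4] = 3\sigma_u^4$ for a real zero-mean Gaussian variable and $E[|u|^4] = 2\sigma_u^4$ for a circular complex one; the symbol $\sigma_u^2$ is introduced here for the illustration.

```latex
% Scalar check of (121): M = 1, R_{u,k} = \sigma_u^2, \Gamma a real scalar.
\begin{align*}
\text{real case } (\beta = 2):\quad
  & E[u^4]\,\Gamma = 3\sigma_u^4\,\Gamma
    = 2\,\sigma_u^2\,\Gamma\,\sigma_u^2 + \sigma_u^2\,\mathrm{Tr}(\Gamma\sigma_u^2), \\
\text{circular complex case } (\beta = 1):\quad
  & E[|u|^4]\,\Gamma = 2\sigma_u^4\,\Gamma
    = \sigma_u^2\,\Gamma\,\sigma_u^2 + \sigma_u^2\,\mathrm{Tr}(\Gamma\sigma_u^2).
\end{align*}
```

In both cases the right hand side of (121) reproduces the known fourth moment, which is consistent with the choice of $\beta$ stated after (121).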

Appendix C 
Computation of $\mathcal{D}$

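We expand $\mathcal{D} = E[\mathcal{D}_i]$ in (79) as:

$$\mathcal{D} = \Big\{E[A_i \otimes A_i] + E[A_i \otimes E_i^{*T}] + E[E_i \otimes A_i] + E[E_i \otimes E_i^{*T}]\Big\} \otimes I_{M^2} \tag{128}$$

The $(r,z)$-th entry of $E[A_i \otimes A_i]$, denoted by $f_{r,z}$, is:

$$f_{r,z} = E[a_{\ell,k}(i)\,a_{m,n}(i)] \tag{129}$$

where the relation between $(r,z)$ and $(\ell,k)$ is:

$$r = (\ell-1)N + m, \quad \text{and} \quad z = (k-1)N + n \tag{130}$$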


The $(r,z)$-th entry of $E[A_i \otimes E_i^{*T}]$, denoted by $x_{r,z}$, can be expressed as:




= -E 






|hm,n(*) + 


hm,nii) + -\/^'IJm,n(*)|^ > l'm,r. 


(138) 


Likewise, the entries of $E[E_i \otimes A_i]$ and $E[E_i \otimes E_i^{*T}]$ can be expressed in terms of the combination weights, channel coefficients and the estimation error. We can follow the argument presented in Remark 3 to show that the right hand side of (138) as well as the entries of $E[E_i \otimes A_i]$ and $E[E_i \otimes E_i^{*T}]$ are invariant with respect to time and have finite values.


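When $k \neq n$, entries $a_{\ell,k}(i)$ and $a_{m,n}(i)$ come from different columns of $A_i$ and are independent. Hence, in this case, we can write:

$$f_{r,z} = E[a_{\ell,k}(i)]\,E[a_{m,n}(i)] \tag{131}$$

with

$$E[a_{j,q}(i)] = \begin{cases} 1 - \sum\limits_{r\in\mathcal{N}_q\setminus q} p_{r,q}\gamma_{r,q}, & \text{if } j = q \\[2mm] p_{j,q}\gamma_{j,q}, & \text{otherwise} \end{cases} \tag{132}$$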
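The mean of the random combination matrix can be assembled entry by entry from the nominal weights and the link success probabilities. The sketch below assumes the model suggested by the expressions above: an off-diagonal weight $a_{j,q}(i)$ equals $\gamma_{j,q}$ when the link is active, which happens with probability $p_{j,q}$, and zero otherwise, while the self-weight absorbs the remainder so that each column sums to one. The function name and the example values are hypothetical.

```python
def expected_combination_matrix(gamma, p):
    """Return E[A_i] with E[a_{j,q}] = p[j][q] * gamma[j][q] for j != q and
    E[a_{q,q}] = 1 - sum of the off-diagonal means in column q.

    gamma[j][q] is the nominal weight (0 for non-neighbors) and p[j][q]
    is the probability that the link from node j to node q is active."""
    n = len(gamma)
    mean_a = [[0.0] * n for _ in range(n)]
    for q in range(n):
        off_diag = 0.0
        for j in range(n):
            if j != q:
                mean_a[j][q] = p[j][q] * gamma[j][q]
                off_diag += mean_a[j][q]
        mean_a[q][q] = 1.0 - off_diag
    return mean_a

# Two-node example: each link is active half of the time.
gamma = [[0.4, 0.3], [0.6, 0.7]]
p = [[1.0, 0.5], [0.5, 1.0]]
mean_a = expected_combination_matrix(gamma, p)
column_sums = [sum(mean_a[j][q] for j in range(2)) for q in range(2)]
```

When $k \neq n$, the entry $f_{r,z}$ in (131) is then simply the product of two entries of this matrix.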

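When $k = n$, the entries $a_{\ell,k}(i)$ and $a_{m,n}(i)$ come from the same column of $A_i$ and may be dependent. In this case, there are four possibilities:

(1) if $\ell = m$ and $\ell \neq k$:

$$f_{r,z} = \gamma_{\ell,k}^2\,p_{\ell,k} \tag{133}$$

(2) if $\ell = m$ and $\ell = k$:

$$\begin{aligned}
f_{r,z} &= E\Big[\Big(1 - \sum_{\ell\in\mathcal{N}_k\setminus k} a_{\ell,k}(i)\Big)\Big(1 - \sum_{m\in\mathcal{N}_k\setminus k} a_{m,k}(i)\Big)\Big] \\
&= 1 - 2\sum_{\ell\in\mathcal{N}_k\setminus k} p_{\ell,k}\Big(\gamma_{\ell,k} - \tfrac{1}{2}\gamma_{\ell,k}^2\Big) - \sum_{\ell\in\mathcal{N}_k\setminus k} p_{\ell,k}^2\gamma_{\ell,k}^2 + \sum_{\ell\in\mathcal{N}_k\setminus k}\ \sum_{m\in\mathcal{N}_k\setminus k} p_{\ell,k}\,p_{m,k}\,\gamma_{\ell,k}\,\gamma_{m,k}
\end{aligned} \tag{134}$$

(3) if $\ell \neq m$ and $\ell \neq k$ and $m \neq n$:

$$f_{r,z} = \gamma_{\ell,k}\,\gamma_{m,n}\,p_{\ell,k}\,p_{m,n} \tag{135}$$

(4) if $\ell \neq m$ and $\ell = k$ and $m \neq n$:

$$\begin{aligned}
f_{r,z} &= E\Big[\Big(1 - \sum_{j\in\mathcal{N}_k\setminus k} a_{j,k}(i)\Big)a_{m,n}(i)\Big] \\
&= \gamma_{m,n}\,p_{m,n}\Big(1 - \gamma_{m,n} - \sum_{j\in\mathcal{N}_k\setminus\{k,m\}} \gamma_{j,k}\,p_{j,k}\Big)
\end{aligned} \tag{136}$$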


( 137 ) 












